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INTRODUCTION 


When  a  cancerous  lump  is  detected  early  in  the  breast,  the  patient  may  elect  to 
undergo  the  less  traumatic  treatment  of  lumpectomy.  The  treatment  involves  removing 
the  cancerous  lesion  while  leaving  most  of  the  breast  intact.  Although  the  surfaces  of  the 
excised  tissue  are  normally  checked  for  signs  of  cancer  to  ensure  that  the  lesion  was 
completely  removed,  approximately  1  in  5  of  these  patients  suffer  from  recurrence  of  the 
disease.  Therefore,  lumpectomy  alone  frequently  does  not  render  the  patient  disease 
free.  We  hypothesize  that  diseased  cells  exist  outside  the  histologically  identifiable 
border  of  the  biopsied  lesion,  most  likely  in  the  form  of  individual  or  small  groups  of  cells 
that  have  extended  from  the  primary  lesion.  These  cells  could  eventually  create  new 
secondary  cancer  foci.  The  alternative  hypothesis  is  that  the  disease  is  really 
multicentric  and  exists  as  a  system  of  independent  non-connected  (neither  physically  nor 
genetically)  foci.  This  project  uses  3D  digital  microscopy  for  analyzing  tissue  structure  at 
multiple  scales  and  in  situ  genetic  analysis  to  recognize  normal  from  diseased  cells  on 
an  individual  basis.  By  combining  these  two  techniques,  we  will  be  able  to  measure  the 
spatial  distribution  of  genetically  aberrant  cells  versus  normal  appearing  cells  beyond  the 
leading  edge  of  the  intraductal  lesions.  These  cancer  cells  lying  in  the  lumen  of 
morphologically  normal  ducts  could  be  easily  overlooked  using  traditional  histology 
staining.  The  distribution  of  cancer  cells  will  help  us  understand  the  spreading 
mechanism.  Answering  this  question  may  help  us  to  predict  before  surgery  which 
patients  will  suffer  recurrence,  which  patients  need  additional  treatment  following 
lumpectomy  to  avoid  recurrence,  and  provide  valuable  information  in  the  search  for  new 
treatments. 

In  this  project  we  will  use  computerized  microscopy  to  locate  a  lesion  inside  a 
duct  and  to  trace  in  3  dimensions  the  ducts  branching  from  it.  Then,  using  the  technique 
described  above  we  will  detect  the  abnormal  cells  in  the  normal  looking  ducts  extending 
from  the  lesion.  Analysis  of  the  spatial  pattern  of  abnormal  cells  will  tell  us  if  isolated  or 
small  groups  of  cancer  cells  are  present  or  if  some  other  spreading  mechanism  is  taking 
place,  and  how  far  the  spreading  is  from  the  lesion. 
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BODY 


Our  accomplishments  for  the  entire  funding  period  (08/01/00-07/31/04)  will  be 
described  following  the  Tasks  enumerated  in  the  approved  proposal. 

As  mentioned  in  last  year’s  “final"  report,  the  technology  developments  required  for 
this  grant  were  done  under  the  join  budget  of  this  and  another  grant,  “Three-dimensional 
computer-based  mammary  gland  reconstruction  for  measurement  of  the  patterns  of 
hormone  receptor  expression  during  mammary  development'  (DAMD1 7-00-1 -0306).. 
That  is  why  the  technology  accomplishments  reported  here  (mainly  Tasks  1  and  2)  are 
similar  to  those  described  in  the  other  grant’s  report. 

Note:  this  report  corresponding  to  the  one-year  no  cost  extension  requested  last 
year  has  been  written  by  completing  the  “final”  report  of  the  grant  sent  in  August,  2003. 
We  have  kept  the  statements  relative  to  the  accomplishments  done  during  the  first  three 
years,  adding  those  obtained  during  the  one-year  no-cost  extension. 

Task  1.  (Months  1-12)  Modify  an  existing  microscopic  imaging  system  for  acquiring  low 
magnification  (1  pixel=  5  pm)  images  of  entire  tissue  sections  and  for  tracing  in  3D 
the  ducts  in  the  tissue  specimen  from  a  series  of  images  of  adjacent  sections. 

1.  Complete  the  existing  JAVA  based  software  for  interactive  marking  and  3D 
virtual  rendering  of  ducts  so  that  it  allows  any  branching  pattern.  (Months  1-6) 

2.  Interface  the  existing  acquisition  and  registration  software  with  the  JAVA 
application  to  allow  revisiting  of  acquired  slides  for  inspection  and  high-resolution 
acquisition  of  areas  of  interest.  (Months  6-12) 

At  the  end  of  the  entire  funding  period,  we  can  present  R3D2,  a  robust  JAVA 
based  software  that  can  be  used  to  semi-automatically  image  and  reconstruct  tissue 
structures  from  serially  sectioned  thin  (5  pm)  tissue  sections.  This  system  controls  a  fully 
automated  scanning  microscope  and  can  automatically  acquire  entire  sections  stained 
for  either  fluorescence  (e.g.  DAPI)  or  bright  field  (e.g.  H&E)  microscopy.  The  system 
acquires  and  tiles  together  the  multiple  single  field-of-view  images  that  cover  the  entire 
extent  of  the  section,  adjusting  the  focus  plane  of  the  microscope,  when  required  to 
maintain  optimum  contrast  of  the  acquired  images.  All  the  related  sections  that  make  up 
a  tissue  block  are  stored  following  a  predefined  directory  structure  and  can  be  loaded 
and  browsed  easily.  To  allow  real-time  loading  and  browsing  of  these  extremely  large 
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images  (some  of  them  can  reach  70Mb)  we  have  used  memory  mapping  techniques  that 
permit  accessing  the  large  image  files  without  loading  the  entire  image  pointer  in  the 
computer  memory.  Visualization  of  the  entire  section  is  done  by  displaying  reduced 
version  of  the  original  imaged,  subsampled  to  fit  the  size  of  the  visualization  window. 
Then,  zooming  in  areas  of  the  section  is  done  by  retrieving  the  image  data  directly  from 
the  image  file,  based  on  image  coordinates  selected  in  the  subsampled  version  of  the 
image  of  the  entire  section. 

Image  acquisition  can  be  done  in  grayscale  using  the  12  bit  CCD  camera 
attached  to  the  microscope,  or  in  color  by  sequentially  acquiring  grayscale  images  using 
the  proper  excitation/emission  filters  (fluorescence)  or  an  RGB  tunable  liquid  crystal  filter 
(brightfield).  Visualization  of  12  bit  images  in  a  8  bit  per  color  depth  display  can  be  done 
by  linear  compression  or  truncation  of  the  image  data.  In  addition,  images  can  be  linearly 
stretched  to  enhance  low  contrast  images. 

To  complement  its  tissue  imaging  capabilities,  R3D2  provides  interactive  tools  for 
registering  images  of  consecutive  sections,  which  is  necessary  to  faithfully  render  in 
three  dimensions  tissue  structures  traversing  multiple  sections.  This  is  critical,  due  to  the 
significant  missregistration  between  sections,  consequence  of  the  entire  manual 
sectioning  process.  In  fact,  besides  linear  effects  (shifting  and  rotation),  there  are 
frequent  non-linear  effects  such  as  tissue  folding,  shrinking,  stretching  or  tearing  which 
make  registration  extremely  difficult.  Within  the  scope  of  this  grant,  we  have  solved  and 
automated  (see  bellow)  the  linear  registration  problem,  but  correcting  non-linear  effects 
is  one  of  the  future  tasks  that  we  would  like  to  address  in  the  near  future. 

R3D2  is  equipped  with  annotation  tools  (active  pencil,  selection,  grouping, 
splitting,  etc)  that  can  be  used  to  manually  delineate  and  connect  (within  and  between 
sections)  tissue  structures  (in  our  application  normal  ducts  and  lobulo-alveolar 
structures,  DCIS  tumors,  areas  of  invasive  carcinoma,  etc),  that  can  be  rendered  in  3D 
using  a  modified  Delaunay  triangulation  and  the  surface  rendering  OpenGL-based  toolkit 
embedded  in  Java3D.  As  shown,  Java  ensures  seamless  continuity  between  image 
acquisition,  annotation  and  reconstruction  (rendering).  The  3D  rendering  of  the  tissue  is 
interactive,  in  that  the  entire  tissue  “scene”  can  be  seen  from  different  view  points,  at  the 
distance  and  angle  of  the  user’s  choice.  Also,  specific  tissue  “volumes”  can  be  selected 
to  retrieve  information. 

Finally,  R3D2  allows  revisiting  the  original  tissue  slides  from  both  the  images  of 
the  entire  sections  or  from  the  3D  reconstruction  of  the  tissue.  Revisiting  can  be  uses  to 
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visually  inspect  the  slide  under  the  same  of  different  optical  conditions  (i.e.  lense 
magnification,  light-filtering,  etc)  although  is  normally  used  in  to  acquire  multiple-color 
areas  of  interest  at  high  magnification. 

A  detailed  description  of  the  system,  extending  what  has  been  summarized 
above,  has  been  published  in  the  journal  “Microscopy  Research  and  Technique” 
(Bibliography  J1  -Appendix  1-),  and  before  publishing,  presented  in  International 
conferences  (Bibliography  PI,  A1,  A2).  Some  news  agencies  covered  and  distributed 
new  releases  that  were  published  in  several  newspapers  and  online  news  services 
(Bibliography  01,  02,  03,  04). 

Following  the  described  developments,  and  in  the  process  of  applying  R3D2  to 
the  reconstruction  entire  tissue  biopsies,  it  became  clear  that  new  technical 
developments  were  necessary  to  increase  the  throughput  of  the  system,  since  the  time 
required  to  reconstruct  one  case  using  the  existing  mostly  manual  tools  was  in  the  order 
of  two  months.  The  two  main  bottlenecks  in  the  process  described  above  are  the  manual 
registration  and  annotation  of  the  sections.  Therefore  we  developed  automated  tools  for 
both  tasks,  which  have  been  integrated  in  the  system  and  are  currently  being  used, 
providing  substantial  time  saving. 

We  developed  a  multiscale,  multiresolution  registration  algorithm  based  on 
gradient  correlation  between  consecutive  image  sections.  See  Appendix  3  (Bibliography 
P3)  for  a  full  description  of  the  algorithm.  What  follows  is  a  brief  description  of  it.  The 
algorithm  calculates  the  optimum  rigid  body  transformation  (rotation  plus  translation) 
between  each  two  consecutive  images.  To  reduce  the  computational  cost,  we  start  using 
heavily  subsampled  images,  and  refining  the  registration  using  decreasingly  subsampled 
images.  At  each  iteration  level,  the  registration  is  calculated  as  follows:  first,  the  gradient 
of  both  images  (reference  and  registering)  is  calculated  and  thresholded  using  an 
adaptive  threshold.  Then  the  distance  transform  of  the  images  is  obtained,  that  contains 
the  distance  of  each  point  to  the  nearest  gradient  area  (i.e.  boundary).  Finally  the 
distance  transform  corresponding  to  the  image  being  registered  is  scanned  over  that  of 
the  reference  image,  for  different  rotations  and  translations,  and  the  optimum  registration 
is  defined  as  the  absolute  minimum  of  the  product  of  both  images,  which  ideally 
corresponds  to  0,  i.e.,  to  perfect  overlap  between  both  sections.  This  method  has  been 
presented  at  an  international  conference  (Bibliography  A3)  and  accepted  for  platform 
presentation  at  another  (Bibliography  P3). 
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To  address  the  second  bottleneck,  which  is  the  annotation  of  histological 
structures  we  used  partial  differential  equation  (PDE)  morphologically  driven  flows  (i.e. 
Level  Set  methods).  See  Appendix  2  (Bibliography  J3)  for  a  full  description  of  the 
method,  or  read  the  summary  in  the  following  paragraphs. 

A  description  of  the  Level  Set  (LS)  methodology  is  out  of  the  scope  of  this  report. 
Therefore,  a  very  succinct  user-focused  is  provided  next.  In  a  nutshell,  the  LS  approach 
considers  the  image  as  a  force  or  energy  field  determined  by  one  or  a  combination  of 
selected  image  features  (e.g.  intensity,  gradient,  object  curvature,  distance...).  Then  the 
segmentation  of  objects  is  done  by  letting  some  initial  seeds  manually  placed  on  the 
original  image  evolve  under  the  driving  force  of  a  velocity  function  that  depends  on  the 
energy  field.  This  way,  assumed  that  the  right  energy  field  is  selected,  the  curves 
(surfaces  in  3D)  that  define  boundaries  of  the  seeds  will  converge  in  or  near  the 
boundaries  of  the  objects  that  one  wants  to  extract. 

The  initial  curve  is  represented  here  as  the  zero  level  set  of  a  higher  dimensional 
function,  and  the  motion  of  the  curve  is  embedded  within  the  motion  of  that  higher 
dimensional  function.  The  speed  of  that  motion  is  defined  based  on  the  characteristics  of 
the  image  to  be  segmented.  In  our  case  the  speed  is  adjusted  so  that  when  the  interface 
is  on  top  of  areas  with  a  low  gradient  it  expands  quickly,  whereas  when  the  gradient  is 
large  (indicating  the  location  of  an  edge  in  the  image)  the  curve  is  slowed  down.  In 
addition,  a  surface  tension  term  is  included  in  the  speed  function  to  slightly  retard  or 
accelerate  the  contour  depending  on  its  curvature,  thereby  preserving  the  smoothness  of 
the  advancing  front.  This  approach  offers  several  advantages.  First,  the  zero  level  set  of 
the  higher  dimensional  function  is  allowed  to  change  topology  and  form  sharp  corners. 
Second,  geometric  quantities  such  as  normal  and  curvature  are  easy  to  extract  from  the 
hypersurface.  Finally,  everything  expands  directly  to  three  dimensions  if  we  embed  the 
advancing  three-dimensional  surface  as  the  zero  level  set  of  a  four-dimensional  function. 

However,  changing  an  n-dimensional  problem  into  one  in  n+1  dimensions 
increases  the  computational  cost  associated  with  the  method.  The  narrow-band 
approach  accelerates  the  level  set  flow  by  updating  the  position  of  the  curve  only  in  a 
narrow  vicinity  of  its  current  location.  But  in  our  experience,  the  narrow  band  technique 
does  not  reduce  the  computational  cost  to  a  reasonable  limit,  due  to  the  large  size  of  the 
images  of  the  sections.  Thus,  we  propose  to  use  the  fast  marching  method,  a  numerical 
technique  to  solve  the  equation  that  drives  the  movement  of  the  curve  by  combining  an 
efficient  -constrained-  solution  to  the  equation  of  the  movement  of  the  front,  narrow- 
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band  level  set  methods  and  a  min-heap  data  structure.  This  method  is  only  used  for 
monotonically  advancing  fronts  (speed  always  positive  or  negative),  providing  a  result 
very  fast,  albeit  not  as  accurate  as  the  one  obtained  by  using  the  level  set  algorithm. 
This  result  is  then  used  as  the  initial  condition  for  the  slower  but  more  accurate  level  set 
segmentation. 

We  have  used  this  approach  to  segment  histological  structures  on  H&E  and 
fluorescent  (counterstained)  sections  (See  Appendix  2,  Figs.  5,6,7,10  for  some  results) 
The  results  have  been  published  this  year  in  the  peer  reviewed  conference  proceedings 
of  the  SPIE  Biomedical  Optics  Conference  in  San  Jose,  CA,  (Bibliography  P2)  and  a  full 
paper  describing  the  method  and  results  has  been  published  in  a  special  issue  on  Breast 
Cancer  of  the  Journal  of  Biomedical  Optics.  (Bibliography  J3  -Appendix  2-). 

New  tools  have  been  also  developed  during  this  last  year,  that  have  to  do  with 
the  annotation  of  high-resolution  images  of  areas  of  immunostained  sections  and  with 
the  spatial  analysis  of  cellular  events  in  tissue  sections.  On  one  hand,  new  interactive 
tools  have  been  incorporated  to  R3D2  to  allow  annotation  of  molecular  status  in  images 
of  immunostained  cells.  The  tools  allow  annotating  each  individual  nucleus  of  an  image, 
assigning  them  a  level  of  protein  expression  (Negative,  Low,  High)  relative  to  one  or 
more  than  one  (when  using  double  or  triple  immunostaining)  protein.  There  are  other 
tools  to  delete  the  annotations,  change  them,  etc,  along  with  tools  to  mask  and  classify 
areas  of  the  images.  We  have  used  these  tools  (see  below)  to  classify  epithelial  areas  of 
the  mammary  gland  of  mice,  as  being  growing  end  buds,  characteristic  of  pubertal  gland 
development,  small  or  large  ducts  or  alveolar  structures.  After  classifying  the  areas,  a 
comparative  analysis  of  the  expression  of  proteins  can  be  done,  to  compare  the 
expression  in  different,  morphologically  distinct  areas  of  the  gland,  (see  below).  In 
addition,  tools  have  been  added  to  study  the  spatial  statistics  of  cellular  distribution  (and 
pattern  of  protein  expression)  in  the  high  resolution  areas.  These  areas  will  be  used  to 
determine  patterns  of  expression  (grouping,  rejection,  etc)  between  cells  expressing 
specific  proteins,  or  between  cells  expressing  or  not  several  proteins  (when  using 
multiple  immunostaining).  See  Appendix  4  (Bibliography  P4)  for  a  detailed  description  of 
the  spatial  analysis.  Our  method  has  been  accepted  for  platform  presentation  at  an 
international  conference  (Bibliography  P4),  and  will  be  submitted  shortly  to  a  peer- 
reviewed  journal. 

Finally,  we  have  done  work  to  improve  the  results  of  the  registration.  As  it  has 
already  been  mentioned,  the  manual  tissue  sectioning  can  produce  non-linear 
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deformations  in  the  tissue,  such  as  folding,  stretching,  tearing.  Occasionally,  due  to  the 
tissue  conditions  or  improper  maintenance  of  the  microtome,  some  sections  are 
damaged  beyond  recovery  and  are  disposed,  introducing  gaps  in  the  sequence  of 
sections  that  make  up  the  case.  All  these  effects  may  cause  large  misalignment  between 
areas  of  the  section  that  cannot  be  corrected  for  solely  by  applying  the  global  affine 
transformation  previously  described.  Many  non-rigid  registration  methods  have  been 
already  proposed  in  the  literature.  Specific  solutions  based  on  complex  transformations, 
such  as  elastic  registration  or  piecewise  registration,  are  too  expensive  in  computation 
time  given  the  dimensions  of  the  images  of  our  histological  sections.  Therefore  we  opted 
for  a  local  registration  solution  that  can  produce  accurate  results  in  a  reasonable  time. 
Our  local  registration  algorithm  divides  the  reference  image  in  sub-images  and 
calculates  a  correction  vector  for  each  sub-image.  The  correction  vector  is  calculated 
using  the  cross-correlation  between  the  sub-image  in  the  reference  image  and  the 
correspondent  sub-image  in  the  target  image.  This  target  sub-image  is  defined  from  the 
rigid  registration  parameters,  if  a  previous  rigid-body  registration  has  been  previously 
applied  to  the  image.  After  calculating  the  correlation,  every  pixel  in  each  sub-image  is 
applied  the  appropriate  translation  in  the  x-  and  y-  axis  defined  by  the  correction 
vector. 

The  correlation  between  sub-images  can  be  efficiently  calculated  in  the 
frequency  domain.  The  result  is  two  new  images:  modulus  and  phase.  The  modulus  of 
the  correlation  has  a  peak  located  a  distance  that  corresponds  to  the  translation  of  one 
of  the  images  that  would  cause  the  best  alignment  or  similarity  with  the  other  image. 
Therefore,  the  shift  vector  is  obtained  by  calculating  the  difference  between  the 
coordinates  of  the  center  of  the  image  and  the  coordinates  of  the  brightest  peak  in  the 
modulus  image  .  A  graphical  description  of  this  process  is  shown  in  Figure  3. 

Since  the  dimensions  of  the  sub-images  are  fixed,  parts  of  one  or  more 
structures  of  interest  can  belong  to  different  sub-images,  causing  less  accurate  results 
than  if  the  entire  structures  were  inside  the  same  sub-images.  Therefore  the  algorithm 
makes  use  of  two  different  window  sizes:  the  sub-image  size  and  the  correlation  area 
size.  The  correlation  area  is  the  part  of  the  image  centered  in  the  sub-image  that  is  used 
to  calculate  the  correction  vector.  Its  size  must  be  bigger  than  the  sub-image  size  in 
order  to  avoid  the  previously  mentioned  problem.  Indeed,  this  size  should  equal  the 
estimated  maximum  error  produced  during  the  rigid  registration  in  order  to  correct  the 
misalignment  introduced  in  that  process. 
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As  in  the  rigid  registration  algorithm,  the  local  registration  algorithm  can  be 
applied  to  the  gray  scale  images  or  to  the  binary  image  containing  only  the  segmented 
contours.  Some  problems  due  to  the  quality  of  the  images  such  as  improper  focusing  of 
areas  in  the  image,  different  luminosity,  or  even  foreign  bodies  in  the  tissue  could  affect 
the  value  of  the  correlation.  Therefore,  once  the  list  of  correction  vectors  is  obtained, 
filtering  is  applied  to  the  vectors  to  eliminate  erroneous  vectors.  At  the  end  of  this 
process  there  will  be  a  correction  vector  and  a  correlation  coefficient  for  every  sub¬ 
image.  The  first  filter  recalculates  the  vectors  whose  correlation  coefficient  lies  under  a 
threshold  value,  which  is  obtained  using  the  auto-threshold  function  over  the  coefficient 
list.  These  correction  vectors  presenting  a  low  correlation  coefficient  are  substituted  by 
the  average  of  their  3x3  vector  neighborhood.  Thus,  vectors  with  a  very  low  correlation 
coefficient  will  not  be  taken  into  account.  The  second  and  last  filter  to  be  applied  to  the 
list  of  vectors  consists  in  a  mean  filter,  also  with  a  3x3  kernel,  that  will  allow  smoothing 
the  vector  field  and  reducing  the  effect  of  the  possible  noise.  The  result  of  applying  the 
vectors  to  the  coordinates  before  rendering  the  objects  in  3D  is  more  smooth 
reconstructions,  free  from  errors  due  to  non-linear  effects. 
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Task  2.  (Months  1-30)  Using  invasive  cancer  specimens  with  intraductal  extension  of 
DCIS,  identify  DNA  loci  that  are  amplified  in  a  high  proportion  of  the  cells  of  the  invasive 
lesion  using  CGH. 

1.  Select  9  mastectomy  specimens  following  the  criteria  described  in  the  Methods 
section  of  the  Proposal  body  (Months  1-6) 

2.  Section  and  H&E  stain  mastectomy  specimens  (Months  6-12,  2  specimens; 
Months  12-24,  5  specimens;  Months  24-30,  2  specimens) 

3.  Acquire  sections  using  our  registration  software  (Months  12-24  6  specimens; 
Months  24-30,  3  specimens) 

4.  Reconstruct  the  mammary  ducts  and  identify  the  leading  edge  of  the  intraductal 
component  (Months  12-24,  5  specimens;  Months  24-30,  4  specimens) 

As  it  was  mentioned  in  the  report  written  at  the  end  of  the  first  year,  the 
mastectomy  tissue  originally  reserved  for  this  study  was  not  preserved  properly.  Due  to 
this  fact,  we  looked  for  a  new  source  of  tissue  that  was  found  at  the  UCSF 
Comprehensive  Cancer  Center’s  tissue  repository.  However,  instead  of  full 
mastectomies  where  the  first  task  would  have  been  to  look  macroscopically  for  areas 
likely  to  show  intraductal  extension  of  DCIS,  we  were  given  access  to  paraffin- 
embedded  archived  material.  We  received  specimens  four  (UCSF1,  UCSF2,  UCSF3, 
UCSF4),  consisting  on  one  isolated  tissue  block  per  case,  with  the  corresponding 
histological  evaluation  and  Her2  status,  done  on  the  top  slide  of  the  block.  Although 
none  of  the  blocks  showed  a  transition  from  DCIS  to  normal  tissue  on  the  top  slide,  they 
all  had  areas  of  normal  tissue  and  DCIS.  This  led  us  to  think  that  they  could  have  had 
the  desired  transition.  In  fact,  none  of  them  had  it.  Fig  1  and  2  show  examples  of 
reconstructions  of  two  of  the  cases  (UCSF1  and  UCSF2). 

As  described  in  the  report  corresponding  to  the  second  year,  we  continued 
looking  for  tissue  having  the  desired  normal  epithelium  to  DCIS  transition.  We  asked  Dr. 
Alexander  Borowsky  from  the  UC  Davis  Pathology  Department  and  Center  for 
Comparative  Medicine,  who  gladly  identified  several  tissue  specimens  from  his 
Department,  and  helped  us  identifying  those  areas.  We  are  very  grateful  to  him,  since 
we  finally  have  access  to  the  tissue  needed  for  this  grant.  We  have  received  and 
processed  one  case  (UCD1). 
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Task  3.  (Months  1-30)  Use  FISH  with  probes  to  the  loci  identified  by  CGH 

1 .  Do  CGH  on  the  selected  mastectomy  specimens  (Months  1-6) 

2.  Do  FISH  to  the  two  most  amplified  regions  and  a  normal  part  of  the  genome  (for 
control  purposes)  on  intermediate  sections  to  those  used  for  the  reconstruction  of 
the  ductal  system  (Months  6-12,  2  specimens;  Months  12-24,  5  specimens; 
Months  24-30,  2  specimens) 

The  reason  for  using  CGH  was  to  be  able  to  identify  an  amplified  chromosomal  locus 
that  could  be  used  to  track  pre-malignant,  but  otherwise  morphologically  normal  cells 
that  could  move  ahead  of  the  leading  edge  of  the  intraductal  invasion.  That  option,  which 
would  had  been  possible  due  to  the  abundance  of  tissue,  had  we  used  the  original 
mastectomy  specimens,  was  not  feasible  with  the  single  paraffin-embedded  tissue 
blocks  provided  by  the  UCSF  Cancer  Center  Tissue  Repository,  therefore  all  the  chosen 
cases  were  selected  being  Her2+,  and  we  used  the  amplification  of  that  oncogen  as  a 
possible  marker  to  detect  isolated,  premalignant  cells.  FISH  to  the  Her2  gene,  located  in 
chromosome  17,  and  a  centromeric  probe  for  ctr.  17  (control)  was  done  in  cases 
UCSF1,  UCSF2  and  UCSF3. 

UCD1  and  UCSF4,  due  to  reasons  that  will  be  explained  below,  was  immunostained 
for  the  following  proteins:  Ki67  (proliferation),  Estrogen  Receptor,  Progesterone 
Receptor. 
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Task  4.  (Months  18-36)  Use  high  magnification  (1  pixel=  0.5  pm)  fluorescence 
microscopy,  look  for  individual  cells  with  the  same  amplified  loci  in  the  ducts 
emanating  from  the  DCIS  lesions.  Assuming  that  we  see  genetically  aberrant  cells, 
measure  the  spatial  relationship  of  these  cells  to  the  surrounding  cells  in  order  to 
characterize  the  pattern  of  aberrant  cells  and  thus  to  provide  information  about  the 
spreading  mechanism  of  the  disease. 

1.  Automatically  enumerate  FISH  spots  and  measure  the  spatial  distribution  of 
aberrant  cells  (if  found)  in  the  histologically  normal  ducts  starting  at  the  very  front 
of  the  intraductal  tumor  expansion.  (Months  18-24,  4  specimens;  Months  24-36, 
5  specimens) 

We  proposed  to  use  FISH  to  an  amplified  chromosomal  locus  on  the  tissue  blokes, 
and  use  the  amplification  as  a  marker  of  malignancy  that  could  be  tracked  in  normal 
ducts  connected  to  the  DCIS  lesions.  We  hypothesized  that  some  abnormal  isolated 
cells  could  be  the  front  of  the  advancing  DCIS  front.  These  isolated  cells  would  be  easily 
missed  by  routine  pathological  analysis  of  the  tissue.  An  alternative  hypothesis,  the 
genetic  transformation  of  the  cell  could  be  the  consequence  of  a  field-effect,  and  could 
precede  any  morphological  transformation.  As  described  in  Task  2,  the  transition  was 
not  found  in  any  of  the  first  four  cases  received  (UCSF1,  UCSF2,  UCSF3,  UCSF4). 
However,  we  considered  the  quantification  of  the  amplified  gene  worth  doing,  to 
compare  the  number  of  copies  of  the  gene  in  DCIS  areas  with  those  of  nearby  normal 
ducts  since,  although  unlikely  those  areas  might  be  in  fact  connected  in  the  rest  of  the 
tissue,  outside  the  block. 

After  analyzing  the  tissue  we  did  not  find  abnormal,  amplified  cells  in  normal  ducts 
and  alveoli  located  near  DCIS  tumors.  This  works  against  our  hypothesis,  except  for  the 
fact  that  we  don’t  know  for  certain  that  both  types  of  tissue  are  not  connected  outside  the 
limits  of  the  block.  Another  consideration  to  mention  here  is  that  the  efficiency  of  the 
hybridization  seems  to  depend  on  the  type  of  tissue,  and  it  is  very  difficult  to  optimize  it 
to  work  well  for  both  normal  and  tumor  areas.  The  Cytogenetics  Core  of  the  UCSF 
Cancer  Center,  which  performed  the  hybridization,  provided  staining  that  was  optimized 
for  tumor  areas.  In  consequence  some  normal  areas  of  all  tissue  blocks  presented  no 
signals. 
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Another  argument  against  the  use  of  FISH  to  a  single  amplified  gene  as  a  marker  of 
transformation  is  that  genetic  markers  found  in  a  malignant  cancer  are  not  necessarily 
present  in  preneoplastic  lesions,  as  a  consequence  of  the  clonal  progression  of  the 
disease,  which  evolves  by  replacing  complete  populations  of  cells  by  new,  mutated  cells 
with  proliferate  advantage. 

With  that  same  aim  of  determining  the  events  that  drive  and/or  accompany  breast 
cancer  progression,  we  have  maintained  a  collaboration  with  Prof.  Joe  W.  Gray’s 
laboratory  at  the  UCSF  Cancer  Center.  The  goal  of  this  collaboration  was  to  quantify 
genomic  instability  in  the  progression  from  preneoplasia  to  breast  cancer,  and  correlate 
it  with  the  process  of  critical  telomere  shortening  known  as  telomere  crisis.  For  this,  we 
developed  software  tools  for  the  segmentation  of  nuclei  in  2D  and  3D  as  well  as  tools  for 
quantification  of  FISH  hybridizations.  All  these  tools  are  also  available  now  through 
R3D2.  This  collaboration  is  synergic  with  this  project,  without  overlapping  between  the 
goals  of  both  projects,  because  this  collaboration  looks  at  differences  between  tissue 
classified  as  normal,  hyperplastic,  DCIS  or  1C  but  taken  from  different  tissue  blocks 
instead  of  looking  for  morphological  connections  between  them.  In  addition,  this 
collaboration  looked  at  the  process  of  genomic  instability  and  telomere  shortening,  which 
was  not  the  goal  of  our  project. 

The  results  of  this  collaboration,  recently  published  in  Nature  Genetics  (Bibliography 
J4  -Appendix  5-)  are  indeed  relevant  for  this  project  and  are  included  here  because 
were  obtained  in  part  using  the  tools  developed  under  this  project. 

The  reader  is  referred  to  the  actual  paper  for  a  complete  description  of  the  results, 
that  we  can  summarize  as  follows:  We  used  two  and  three  dimensional  image  analysis 
tools  to  quantify  genomic  instability  and  telomere  length  cell  by  cell  in  situ,  i.e.  in  thick 
tissue  blocks.  We  compared  both  events  in  normal,  hyperplastic  (usual  ductal 
hyperplasia  -UDH-),  in  situ  carcinomas  (DCIS)  and  invasive  carcinomas  (1C).  The 
results  of  our  analysis  show  that  an  important  increase  in  genomic  instability  occurs  in 
the  transition  from  UDH  to  DCIS,  which  coincides  with  critical  telomere  shortening. 
Therefore,  DCIS  would  be  a  low-probability  successful  passage  through  telomere  crisis 
mediated  by  reactivated  telomerase.  We  showed  this  trend  both  in  tissue  and  in  vitro,  by 
using  cell  lines  that  closely  mimic  both  the  process  of  genomic  disarray  and 
chromosome  end  shortening. 

On  a  different  collaborative  study  (Biography  J6),  transgenic  mice  expressing  SV40 
or  HPV16  oncogenes  that  elicit  carcinomas  in  pancreas  and  skin,  respectively,  were 
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rendered  telomerase-deficient.  Absence  of  telomerase  had  minimal  impact  on 
tumorigenesis,  exhibiting  shortened  telomeres  and  phenotypic  abnormalities  in  multiple 
organs.  Analyses  of  chromosomal  aberrations  were  not  indicative  of  telomere 
dysfunction  nor  increased  genomic  instability  in  tumors.  Using  our  quantitative  nuclear 
and  telomere  segmentation  tools  we  compared  telomere  intensity  (i.e.  relative  telomere 
length)  in  biopsies  of  skin  hyperplasia,  dysplasia  and  carcinoma.  Our  study  revealed  that 
telomere  numbers  and  relative  lengths  were  maintained  during  progression,  implicating 
a  means  for  preserving  telomere  repeats  and  functionality  in  the  absence  of  telomerase. 

Encouraged  by  the  results  obtained  in  J5,  we  decided  to  look  at  changes  in  the 
expression  of  markers  between  normal,  preneoplastic  and  neoplastic  neighboring  areas 
using  the  two  cases  left  to  analyze  in  this  project  (UCSF4  and  UCD1).  Due  to  the 
problems  already  mentioned,  we  used  a  different  approach  that  takes  advantage  of  the 
high  number  of  consecutive  sections  to  use  several  markers,  instead  of  limiting 
ourselves  to  a  single  marker.  The  markers  that  we  used  are  not  genetic  but  epigenetic 
and  therefore  not  related  to  a  particular  genetic  event.  This  way  we  can  still  look  at 
preneoplastic  changes  in  the  normal  areas  connected  of  surrounding  the  DCIS  tumors, 
by  combining  the  expression  of  several  markers.  We  used  markers  of  proliferation  (Ki67) 
and  two  markers  of  differentiation  and  hormonal  responsiveness  (PR,  ERa).  What 
follows  is  a  description  of  the  way  the  tissue  was  processed  and  analyzed. 

The  tissue  was  cut  in  5pm-thick  sections  using  a  microtome.  Each  odd-numbered 
section  was  stained  with  hematoxylin  and  eosin  (H&E)  and  mounted  on  a  silane-coated 
glass  slide.  Each  even-numbered  slide  was  antibody-stained  for  the  presence  of  ER-a, 
PR  or  Ki67. 

Accurate  rendition  of  the  architecture  of  the  tumors  and  surrounding  normal 
tissue  requires  the  identification  and  tracking  of  all  epithelial  structures  in  serial  sections 
followed  by  computer-based  rendering.  Using  R3D2,  we  acquired  low-resolution  image 
scans  of  all  the  H&E  and  immunostained  sections.  For  the  low  magnification  scans  we 
used  a  10X  0.5  NA  Zeiss  Fluar  objective,  which  provided  acceptably  low  spherical 
aberration  of  the  field  of  views  and  enough  optical  resolution  to  resolve  all  relevant 
epithelial  structures  in  the  images.  However,  due  to  the  size  of  the  sections  at  full 
resolution,  the  images  were  subsampled  by  a  factor  of  4  in  both  X  and  Y  directions,  for 
an  effective  2.5X  (25  times  when  considering  the  magnification  of  the  microscope  tube) 
magnification.  All  the  images  of  the  H&E  and  immunostained  sections  were  saved  in  ICS 
format  and  stored  as  a  case. 
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The  number  of  sections  per  case  was  approximately  200  sections,  each  section 
using  between  20  and  100MB  of  memory.  Due  to  the  large  image  size,  which  prevents 
from  efficiently  loading  them  into  the  memory  of  the  computer,  the  image  files  were 
mapped  into  the  computer  memory,  thus  actually  loading  in  the  memory  the  parts  of  the 
images  being  displayed  at  full  resolution. 

A  topologically  accurate  rendition  of  the  epithelial  structures  (e.g.  ducts,  TDLU’s, 
tumors)  of  a  case  requires  that  all  the  images  of  the  sections  be  correctly  registered  (i.e. 
aligned),  to  ensure  the  continuity  of  structures  crossing  several  sections.  R3D2  provides 
an  automatic  registration  tool  that  calculates  the  best  matching  between  each  pair  of 
sections  after  rigid-body  transforming  one  of  them  with  respect  to  the  other.  This  linear 
rigid-body  transformation  (i.e.  rotation  +  translation)  is  accurate  enough  for  a  correct 
global  alignment  of  the  sections  and  therefore  to  create  a  topologically  correct  rendition 
of  the  structures.  However,  the  manual  sectioning  process  introduces  non-linear  effects 
(tissue  tearing,  stretching,  folding,  local  artifacts,  etc),  which  are  not  addressed  in  this 
study,  where  we  are  not  as  interested  in  the  smoothness  of  the  final  reconstruction  as 
we  are  in  its  topological  accuracy.  Thus  we  used  R3D2’s  linear  rigid  body  transformation 
tool  to  automatically  align  all  the  sections  of  the  case.  The  algorithm  works  in  batch 
mode  for  the  entire  case,  and  does  not  require  any  human  interaction. . 

After  registration,  selected  epithelial  areas  (ducts,  TDLU’s,  tumors),  were 
manually  delineated  on  the  images  of  the  H&E  stained  sections  and  parts  belonging  to 
the  same  structures  that  lie  in  different  sections  were  manually  joined  to  set  the 
continuity  of  the  structures.  This  process  of  grouping  identifies  each  object  as  a  3D 
volume,  which  can  then  be  rendered  in  3D  using  R3D2’s  triangulation  algorithm. 

The  work  of  the  pathologist  requires  revisiting  areas  of  the  tissue  at  different 
magnification  and  relating  areas  from  H&E  stained  and  immunostained  sections.  When 
more  than  one  slide  is  involved,  this  requires  reloading  the  corresponding  slide(s)  and 
looking  for  the  areas  that  want  to  be  revisited.  Given  the  complexity  of  the  tissue  and  the 
difference  in  magnification,  which  often  requires  switching  back  and  forth  between  a  dry 
and  an  oil-immersion  objective  lens,  this  can  be  rather  cumbersome  and  time 
consuming. 

Our  system  preserves  the  spatial  correspondence  between  the  pixels  of  the  low 
magnification  images  and  the  actual  x/y  coordinates  of  slides  in  the  microscope  stage. 
This  greatly  simplifies  the  process  of  revisiting  the  sections,  which  only  requires  clicking 
on  the  areas  of  the  low  magnification  images  that  need  to  be  revisited.  R3D2  then 
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calculates  the  appropriate  microscope  movement  that  places  it  in  the  desired  area(s). 
This  can  be  used  to  visually  revisit  the  tissue  or  to  acquire  new  images  at 
higher/magnification  (see  below)  or  using  a  different  color  filter  for  multi-color  image 
acquisition.  This  way,  since  the  areas  are  identified  at  the  computer  on  the  acquired 
images  and  not  on  the  actual  slides,  no  switching  between  lenses  is  required,  very  much 
streamlining  the  revisiting  process. 

Using  this  feature,  we  revisited  at  high  magnification  (40X,  Plan  Neo,  1 .39  NA)  all 
the  areas  selected  for  rendering  on  the  H&E  sections.  However,  instead  of  visiting  the 
H&E  sections,  we  revisit  the  contiguous  -properly  registered-  immunostained  sections, 
to  image  the  expression  of  the  markers  (ER,  PR,  Ki67)  in  the  epithelium  of  the  selected 
areas. 

All  areas  were  imaged  in  color  by  consecutively  imaging  three  black  and  white 
images  after  inserting  the  appropriate  color  filter  in  the  light  path  of  the  microscope.  To 
avoid  manual  interaction  we  used  a  crystal  tunable  RGB  filter  controlled  by  R3D2.  When 
the  areas  were  larger  than  a  field  of  view  of  the  microscope  at  40X,  image  scans  were 
taken,  the  way  it  was  done  for  the  low  magnification  images,  although  in  color. 

All  the  images  of  the  areas  were  stored  in  the  case,  related  to  their 
corresponding  sections  and  selected  structures.  The  images  can  be  easily  displayed  by 
clicking  in  the  corresponding  areas  of  the  low-resolution  images.  This  action  opens  a 
new  window  that  displays  the  high-resolution  image  and  provides  access  to  the  analysis 
tools  described  below. 

R3D2’s  image  analysis  tools  were  used  to  quantify  the  presence  and  distribution 
of  molecular  markers  in  all  the  high  magnification  images  of  the  areas  of  the  case. 
Based  on  DAB  staining  or  fluorescence  immunostaining,  the  luminal  cells  stained  for  a 
ER-a,  PR  or  Ki67  were  interactively  classified  as  negative  or  positive.  If  the  areas 
contained  more  than  one  type  of  structure,  masks  were  used  to  classify  them  separately. 
This  way,  all  the  epithelial  fragments  of  the  areas  were  classified  under  one  of  the 
following  categories:  large  duct,  small  duct,  terminal  ductal  lobular  unit  (TDLU),  ductal 
carcinoma  in  situ  (DCIS),  invasive  carcinoma  (1C)  or  abnormal  variants  (hyperplasia)  or 
the  normal  tissue  structures.  After  masking,  the  numbers  of  positive  cells  for  each 
receptor  was  calculated  for  each  type  of  structure  separately.  Figures  5  and  7  show 
some  examples  of  the  different  types  of  structures  selected  on  cases  UCSF4  and  UCD1 
respectively. 
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Figure  4  shows  the  results  of  the  analysis  of  UCSF4,  where  only  a  subset  of  the 
above  mentioned  categories  was  found,  and  therefore  used  for  the  analysis.  The  graph 
shows  a  five  fold  difference  in  the  expression  of  ER  between  large,  mature  ducts  (4.8%) 
and  TDLU’s  (24.9%),  that  can  be  explained  because  the  lobular  areas  are  subject  to 
periodic  cycles  of  proliferation  that  don’t  affect  the  epithelium  of  mature  ducts.  TDLU’s 
also  areas  of  potential  tissue  remodeling  (upon  pregnancy  or  involution),  which  could 
explain  the  high  receptor  levels,  although  during  periods  that  don’t  involve  remodeling 
the  actual  levels  of  ERa  mediated  proliferation  are  lower  than  those  in  large  ducts,  as 
shown  by  the  fact  that  the  levels  of  proliferation,  as  expressed  by  the  number  of  Ki67 
positive  cells  (3.2%)  is  higher  in  large  ducts  than  in  TDLU’s  (0.4%). 

The  comparison  between  the  normal  epithelium  and  shows  both  increased 
proliferation  (10.7%  vs.  3.2%)  and  ERa  expression  (28%  vs.  4.8%)  in  areas  with 
intraepithelial  tumor  expansion  (See  Fig.  5a,  top,  right  area  of  the  duct),  as  seems 
normal  due  to  the  higher  proliferative  activity  of  the  tumor  cells. 

Finally,  we  compared  two  DCIS  areas,  one  of  them  that  still  preserves  some 
resemblance  of  the  pre-tumor  morphology  (Fig.  5c)  vs.  another,  more  advanced  where 
the  original  morphology  has  been  completely  overridden  by  the  expansion  of  the  tumor. 
Both  the  levels  of  proliferation  (10.9%  vs.  14.3%)  and  ERa  expression  (13.1%  vs. 
33.9%)  are  lower  in  the  former  than  in  the  latter,  indicating  that  the  process  of  clonal 
replacement  has  not  been  completed  in  this  area. 

The  levels  of  PR  are  consistently  low  for  all  structure  types,  except  for  high 
expressing  TDLU’s.  Although  not  statistically  significant  due  to  the  low  numbers,  losing 
PR  expression  seems  to  be  the  trend  that  accompanies  clonal  replacement  (see 
abnormal  large  ducts  vs.  normal  large  ducts  and  DCIS  in  TDLU  vs.  pure  DCIS).  The 
phenotype  of  the  DCIS  cells  therefore  is  one  of  relatively  high  ERa  expression  and  no 
PR  expression,  as  opposed  to  normal  structures  that  show  correlation  (the  degree 
depending  on  the  structure  type)  between  ERa  and  PR. 

Figure  6  shows  the  results  for  case  UCD1,  which  contained  most  of  the  tissue 
categories  listed  before  (See  Fig.  7  for  tissue  histology). 

The  comparison  between  the  expression  levels  of  the  normal  large  ducts  (Fig  7a) 
and  the  large  hyperplastic  ducts  (Fig  7b)  shows  drastic  increase  of  ERa  (8.8%  vs. 
77.6%)  and  PR  (11.5%  vs. 39. 51%)  expression  and  a  relatively  large,  statistically 
significant  increase  in  cellular  proliferation  (1.7%  vs.  6.8%)  due  to  progressive  filling  of 
the  duct  with  cells  dominantly  positive  for  both  hormone  receptors.  A  similar  trend  can  be 


19 


seen  when  comparing  small  and  hyperplastic  ducts,  although  in  this  case  both  types 
have  similar  levels  of  proliferation.  This  trend  is  reverted  when  comparing  normal  vs. 
hyperplastic  TDLU,  since  the  receptor  levels  are  lower  in  the  hyperplastic  TDLU  than  in 
the  normal  ones  (15.9%  vs.  29.9%  ERa,  10.8%  vs.  24.6%  PR),  implying  that  in  this 
structure  the  replacing  cells  that  fill  the  lumens  are  preferentially  hormone  receptor 
negative.  Finally,  the  comparison  between  DCIS  and  Invasive  areas  seem  to  indicate 
that  the  phenotype  of  these  cells,  at  least  in  what  has  to  do  with  hormone  receptor  status 
and  proliferation,  is  very  similar  (see  similar  levels  of  ERa,  PR,  Ki67  in  Fig.  6),  and 
different  from  the  phenotype  of  the  hyperplastic  areas,  that  have  slightly  higher  levels  of 
Era,  and  significantly  lower  levels  of  PR.  This  observation  is  strikingly  different  from  what 
we  saw  in  UCSF4,  where  there  was  a  complete  loss  of  PR  in  the  advanced  -neoplastic- 
stage  (DCIS). 
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KEY  RESEARCH  ACCOMPLISHMENTS 

The  main  accomplishments  achieved  from  this  project  are  listed  below.  They  will  be 
completed  during  the  one-year  extension  requested  for  this  project,  justified  in  page  2  of 
this  report. 

•  We  have  developed  a  system  that  allows  semi-automatic  3D  reconstruction  of 
tissue  samples  from  fully  sectioned  tissue  blocks.  The  system  allows  acquisition 
of  entire  tissue  sections  at  low  magnification,  both  in  bright  field  and  fluorescence 
microscope,  registration  of  images  of  consecutive  sections  and  annotation  and 
3D  rendering  of  tissue  structures  (epithelial,  endothelial,  etc).  The  system  allows 
revisting  of  areas  of  the  tissue  at  higher  magnification  from  both  the  3D 
reconstruction  of  the  tissue  as  wells  as  from  the  low  resolution  images  of  the 
sections.  (Bibliography  J1  -Appendix  1-),  PI,  A1,  A2,  A3,  01,  02,  03,  04) 

•  To  solve  the  bottlenecks  of  tissue  imaging  and  reconstruction,  we  have 
developed  tools  that  automatically  register  consecutive  sections  (Bibliography 
A4)  and  for  the  unmanned  segmentation  of  histological  structures  using 
geometrically  driven  image  flows  (Bibliography  P2  -Appendix  2-,  J3  -Appendix 
3-) 

•  We  have  developed  software  that  can  segment  counterstained  nuclei  and 
quantify  FISH  signals  or  gene  expression  from  the  fluorescent  high-magnification 
areas  of  interest. 

•  We  have  successfully  imaged,  reconstructed  and  revisited  at  high  resolution  five 
biopsy  of  tissue  from  a  patient  with  Ductal  Carcinoma  Is  Situ  (DCIS)  of  the 
breast.  (UCSF1,  UCSF2,  UCSF3,  UCSF4,  UCD1).  The  tissue  was  fully 
sectioned  and  alternatively  stained  with  H&E  and  a  nuclear  fluoresecent 
counterstain  (DAPI)  plus  FISH  with  a  probe  against  the  DNA  locus  of  the  erb-b2 
producing  gene  (UCSF1,  UCSF2,  UCSF3).  UCD1,  which  presents  several  areas 
of  transition  normal-DCIS  was  also  alternatively  stained  with  H&E  and 
immunostained  for  the  following  markers  (ki67,  ER,  PR).  UCSF4  was  stained  as 
UCD1. 
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The  analysis  of  UCSF1,  UCSF2,  UCSF3  did  not  reveal  the  existence  of 
transformed  cells  within  the  epithelium  near  DCIS  areas.  However,  as  already 
explained,  it  is  likely  that  there  is  no  physical  connection  between  the  normal 
ducts  and  DCIS  tumors,  in  which  case  we  would  not  expect  to  see  a 
transformation. 

Using  tools  developed  for  this  project,  we  established  or  continued  successful 
collaborations  with  finding  synergistic  to  those  of  this  project.  This  way  we  were 
able  to  quantify  in  situ  the  levels  of  genomic  instability  and  map  a  sharp  increase 
of  instability  in  the  transition  from  usual  ductal  hyperplasia  (UDH)  to  DCIS.  We 
also  located  telomere  crisis  at  that  exact  point  the  progression  of  the  disease, 
proving  the  DCIS  is  a  low  probability  transition  of  one  or  a  few  cells  that  are  able 
to  survive  with  unstable  genome  by  means  of  telomerase  reactivation.  This  study 
has  been  recently  published  in  a  high  impact  scientific  journal  (Bibliography  J4  - 
Appendix  5-) 

Analysis  of  the  last  two  cases,  UCSF4  and  UCD,  shows  clear  clonal  evolution  in 
the  transition  from  normal  to  hyperplasia  and  from  that  to  neoplasia  (DCIS  and 
1C),  associated  with  changes  in  the  expression  of  the  receptors  and  levels  of 
proliferation.  However  interesting  these  differences  are,  which  will  be  used  as 
preliminary  data  for  future  projects,  the  number  of  samples  used  doesn’t  allow  us 
to  extract  general  consequences. 


REPORTABLE  OUTCOMES 


Manuscripts  (published  or  accepted): 

•  “ A  system  for  combined  three-dimensional  morphological  and  molecular  analysis  of 
thick  tissue  samples"  Fernandez-Gonzalez  R.,  Jones  A.,  Garcia-Rodriguez  E.,  Chen 
P.Y.,  Idica  A.,  Barcellos-Hoff  M.H.,  Ortiz  de  Solorzano  C.  Microscopy  Research  and 
Technique  59(6):522-530,  2002.. 

•  “ Recent  advances  in  quantitative  digital  image  analysis  and  applications  in  Breast 
Cancel1’.  Ortiz  de  Solorzano  C.,  Callahan  D.E.,  Parvin  B.,  Costes  S.,  Barcellos-Hoff, 
M.H.  Microscopy  Research  and  Technique  59  (2):  119-127,  2002. 

•  “A  geometric  model  for  image  analysis  in  cytology "  Ortiz  de  Solorzano  C.,  R.  Malladi, 
Lockett  S.  In:  Geometric  methods  in  bio-medical  image  processing.  Ravikanth 
Malladi  (Ed.).  Springer  Verlag  2002,  pp.  19-42. 

•  “Automatic  segmentation  of  histological  structures  in  mammary  gland  tissue 
sections”.  Fernandez-Gonzalez  R.,  Deschamps  T.,  Idica  A.K.,  Malladi  R., 
Ortiz  de  Solorzano  C.  Journal  of  Biomedical  Optics  9(3):445-453,  2004. 

•  “Absence  of  telomerase  has  minimal  effects  in  mouse  models  of  skin  and  pancreatic 
tumorigenesis  caused  by  viral  oncogenes:  evidence  for  telomere  stabilization  in  short 
telomere,  telomerase-deficient  carcinomas”.  David  Argilla,  Koei  Chin,  Mallika  Singh, 
Graeme  Hodgson,  Marcus  Bosenberg,  Carlos  Ortiz  de  Solorzano,  Steven  Lockett, 
Ronald  A.  DePinho,  Joe  Gray,  and  Douglas  Hanahan.  Accepted  for  publication  at 
Cancer  Cell. 

•  “In  situ  analysis  of  genome  instability  in  breast  cancer”.  Chin  K.  *,  Ortiz  de  Solorzano 
C.  *,  Knowles  D.,  Jones  A.,  Chou  W.,  Garcia-Rodriguez  E.,  Wei  R.,  Kuo  W-L.,  Ljung 
B-M.,  Gray  J.W.  *,  Lockett  S.J*.  Nature  Genetics  (*  equal  contributors).  Online 
version:  DOI  10.1038/Ng1409,  2004. 

Manuscripts  (in  preparation): 

•  “Quantitative  Image  Analysis  in  Mammary  Gland  Biology”.  Fernandez-Gonzalez  R., 
Barcellos-Hoff  M.H.,  Ortiz  de  Solorzano  C.  Review  paper  requested  for  a  special 
Journal  of  Mammary  Gland  Biology  and  Neoplasia  issue  on  Quantitative  analysis  of 
the  mammary  gland  (to  be  published  in  December  2004). 
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Conference  Proceedings: 

•  “A  system  for  computer-based  reconstruction  of  3-dimensional  structures  from  serial 
tissue  sections:  an  application  to  the  study  of  normal  and  neoplastic  mammary  gland 
biolog y”.  Fernandez-Gonzalez  R.,  Jones  A.,  Garcia-Rodriguez  E.,  Knowles  D.,  Sudar 
D.,  Ortiz  de  Solorzano  C.  Proceedings  Microscopy  and  Microanalysis’01 .  Microscopy 
and  Microanalysis  7,  Supplement  2,  pp.964-965,  2001 

•  “Automatic  segmentation  of  structures  in  normal  and  neoplastic  mammary 
gland  tissue  sections".  Fernandez-Gonzalez  R.,  Deschamps  T.,  Idica  A.K., 
Malladi  R.,  Ortiz  de  Solorzano  C.,  Proceedings  of  Photonics  West  2003,  Vol.  4964, 
2003 

•  “Automatic  and  segmentation-based  registration  of  serial  mammary  gland  sections”. 
Arganda-Carreras  l„  Fernandez-Gonzalez  R.,  Ortiz  de  Solorzano  C.  Accepted  for  the 
IEEE  Engineering  in  Medicine  and  Biology  Society  (EMBS)  International  Conference, 
San  Francisco  September  2004. 

•  “A  tool  for  the  quantitative  spatial  analysis  of  mammary  gland  epithelium”.  Fernandez- 
Gonzalez  R.,  Ortiz  de  Solorzano  C.  Accepted  for  the  IEEE  Engineering  in  Medicine  and 
Biology  Society  (EMBS)  International  Conference,  San  Francisco  September  2004. 

Presentations: 

•  A  system  for  computer-based  reconstruction  of  3-dimensional  structures  from  serial 
tissue  sections:  an  application  to  the  study  of  normal  and  neoplastic  mammary  gland 
biology.  Microscopy  and  Microanalysis’01,  Long  Beach,  CA  August  5th-9th,  2001. 
Platform  presentation. 

•  "3D  Histo-Pathology:  towards  a  morphological  characterization  of  ductal  carcinoma  in 
situ  of  the  breast "  Annual  Meeting  of  the  American  Association  for  Cancer  Research 
(AACR).  San  Francisco,  CA,  April  4-9,  2002. 

•  “3D  Mammary  Histopathology”  Fernandez-Gonzalez  R.,  Idica  A.  K.,  Ortiz  de  Solorzano 
C.  2003  Mammary  Gland  Biology  Gordon  Research  Conference  .  Roger  Williams 
University,  Bristol,  Rhode  Island,  June  1-6,  2003. 

•  “Automatic  Registration  of  Mammary  Gland  Section  Images”  Arganda-Carreras  I., 
Fernandez-Gonzalez  R.,  Ortiz  de  Solorzano  C.  First  International  Meeting  on  Applied 
Physics  (APHYS  2003),  Badajoz,  Spain,  October  13th-18th,  2003 
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•  “Three-dimensional  heterogeneity  of  the  mammary  gland  and  breast  tumors”.  Ortiz  de 
Solorzano  C.,  Gordon  Research  Conference  on  Mammary  Gland  Biology.  II  Ciocco, 
Italy,  May  2004. 

Informatics: 

•  As  described  in  the  Body  of  the  report  and  in  the  Reportable  Outcomes  sections,  we 
have  developed  and  integrated  new  methods  to  automatically  extract  histological 
information  from  tissue  sections,  as  well  as  morphological  and  molecular  information 
at  the  cellular  level. 

Funding  obtained: 

•  Segmentation  of  Mammary  Gland  Ductal  Structure  Using  Geometric  Methods.  P.I.’s 
Malladi  R.  and  Ortiz  de  Solorzano  C.  Granted  by  the  LBNL  Laboratory  Directed 
Research  and  Development  Program  (LDRD),  in  the  Strategic-Computational  Sub- 
Program.  Period  Oct  2001-  Sept  2004 

•  Characterization  of  Adult  Stem  Cell  Involvement  in  Mammary  Gland  Development. 

PI:  Dr.  Carlos  Ortiz  de  Solorzano  Funded  by:  LBNL  Laboratory  Directed  Research 
and  Development  Program  (LDRD).  Period  Oct  2002-Sept  2004 

•  Three-dimensional  Modeling  of  breast  cancer  progression.  PI:  Dr.  Carlos  Ortiz  de 
Solorzano.  Funded  by:  University  of  California,  Breast  Cancer  Research  Program 
Grant  Number  -  8WB-01 50 

•  "Characterization  of  label-retaining  cells  and  their  niche  in  the  mouse  mammary 
gland"  DOD  Breast  Cancer  Research  Program  Predoctoral  Fellowship.  PI.  Rodrigo 
Fernandez-Gonzalez. 

•  NASA  NSCOR  Program  Project.  PI.  Mary  Helen  Barcellos-Hoff,  Ph.D. 

Funding  applied: 

•  “An  automated  system  for  three-dimensional  Histopathology”.  NIH  R21/R33  Phase 
Innovator  Award,  for  the  PAR:  “Technology  developments  for  Biomedical 
Applications”  PI.  Carlos  Ortiz  de  Solorzano,  Ph.D. 

•  “Biological  Basis  and  Functional  Phenotypes  of  Breast  Density”.  NIH  Program 
Project.  PI:  Thea  Tlsty,  Ph.D. 
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Employment  or  Research: 

•  Due  to  the  successful  performance  of  the  PI  as  a  Scientist  during  the  first  two  years 
of  the  project,  in  2002  he  was  promoted  to  a  Staff  Scientist  Position  at  the  Life 
Sciences  Division,  Lawrence  Berkeley  National  Laboratory  of  the  University  of 
California. 

•  This  grant  has  partially  supported  Mr.  Rodrigo  Fernandez-Gonzalez,  a  Ph.D. 
candidate  in  the  joint  UC  Berkeley-UC  San  Francisco  Program  in  Bioengineering. 
Rodrigo  continues  working  with  me  part  time  as  a  Graduate  Student  Research 
Assistant.  On  January  2003  he  was  granted  a  DOD-BCRP  predoctoral  fellowship 
“Characterization  of  label-retaining  cells  and  their  niche  in  the  mouse  mammary 
gland  “  that  will  support  him  until  the  end  of  his  graduate  work. 

•  Half  way  through  the  reporting  period  (in  January  2002),  Dr.  Umesh  Adiga,  a  Ph.D.  in 
Computer  Sciences,  joined  my  lab  as  a  postdoctoral  fellow  to  work  on  the  image 
analysis  required  for  the  automatic  segmentation  of  nuclei  and  FISH  signals,  as  well 
to  other  image  analysis  and  processing  tools  required  for  this  project.  He  left  the 
laboratory  in  January  2003.  Unfortunately,  his  performance  was  lower  than  expected, 
based  on  his  previous  work  and  references. 

•  In  October  2002  Dr.  Ouahiba  Laribi,  a  Ph.D.  in  Molecular  and  Cell  Biology  joined  the 
lab.  She  has  been  a  phenomenal  help  in  the  last  two  years. 

•  Mr.  Adam  Idica,  an  Integrated  Biology  undergraduate  student  at  UC  Berkeley  worked 
in  this  project  for  three  years.  He  graduated  from  UC  Berkeley  in  May  2003  and  has 
continued  working  in  our  lab  as  Technical  Assistant  until  July  2004,  when  he 
continued  his  studies  towards  a  Medical  Degree.. 

•  Two  other  undergrad  students,  Ms.  Abbey  Hartland  and  Mr.  Shamroze  Khan,  from 
Shasta  College  (Redding,  CA)  and  UC  Berkeley,  have  worked  in  my  lab  as 
undergraduate  technical  assistants  and  have  greatly  contributed  to  produce  the 
quantitative  results  of  this  project. 
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•  CONCLUSIONS 


In  summary,  during  the  administrative  length  of  the  project,  we  have  developed 
the  computer  and  microscopy  platform  that  we  proposed  to  develop.  The  developments 
have  being  slower  than  expected  due  to  the  realization  of  the  need  for  automating  some 
of  the  time  consuming  tasks,  namely  the  registration  and  annotation  of  the  images  of  the 
sections.  The  developments  have  been  or  are  in  the  process  of  being  published  in  peer- 
reviewed  journals  and  have  been  successfully  presented  in  international  conferences. 
We  believe  that  this  advanced  computerized  microscopy  platform,  developed  thanks  to 
the  funding  support  of  this  grant,  will  be  of  use  in  many  future  studies  that  required 
looking  at  molecular  events  at  cellular  level  within  the  tissue  context  where  they  occur. 

Due  to  the  above  mentioned  delay  in  the  technical  developments,  which  was 
accompanied  by  other  factors  listed  in  pg.2  of  this  report,  the  analysis,  which  aims  at 
locating  signs  of  preneoplastic  transformation  in  normal  epithelium  connected  to  DCIS 
lesions,  suffered  a  significant  delay.  Out  of  the  9  proposed  samples,  we  have  processed 
5  tissue  samples.  We  did  not  find  any  signs  of  preneoplastic  transformations  in  the 
normal  epithelium  of  three  of  the  five  cases  processed.  Due  to  the  reasons  described  in 
the  description  of  Task4,  we  modified  the  original  plan,  and  in  the  two  last  cases  we  will 
looked  at  multiple  phenotypic  markers  instead  of  using  a  single  genetic  marker 
(amplification  of  Her2).  Using  markers  for  hormone  receptor  status  (Era,  PR)  and 
proliferation  (Ki67)  we  found  phenotypic  changes  in  different  morphological  areas. 
However  interesting  these  differences  are,  which  will  be  used  as  preliminary  data  for 
future  projects,  the  number  of  samples  used  doesn’t  allow  us  to  extract  general 
consequences. 
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-  V* 


Figure  1.  Reconstruction  of  case  UCSF1.  A)  Low  resolution  (2.5X)  image  of  an  H&E  stained 
section.  B)  3D  reconstruction  of  selected  tumor  bodies  and  normal  areas.  C)  Reduced  version  of 
A,  where  red  lines  indicate  the  structures  rendered  in  B.  D)  High  resolution  (40X)  Area  of  a 
consecutive  section,  counterstained  with  DAPI.  E)  Same  areas  in  D,  hybridized  with  a  FITC 
labeled  centromeric  probe  of  chr.  17.  F)  Same  area  as  in  D,E,  hybridized  with  a  CY3  labeled  probe 
to  the  Her2  (erbb2)  gene  in  chromosome  17. 
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Figure  2.  Reconstruction  of  case  UCSF2.  A)  Low  resolution  (2.5X)  image  of  an  H&E  stained 
section.  B)  3D  reconstruction  of  selected  tumor  bodies  and  normal  areas.  C)  Reduced  version  of  A, 
where  red  lines  indicate  the  structures  rendered  in  B.  D)  High  resolution  (40X)  Area  of  a 
consecutive  section,  counterstained  with  DAPI.  E)  Same  areas  in  D,  hybridized  with  a  FITC  labeled 
centromeric  probe  of  chr.  17.  F)  Same  area  as  in  D,E,  hybridized  with  a  CY3  labeled  probe  to  the 
Her2  (erbb2)  gene  in  chromosome  17. 
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Figure  3:  These  set  of  images  describe  the  calculus  of  the  correction  vector  in  the 
frequency  domain.  Figures  A  and  B  represent  exactly  the  same  part  of  a  tissue  section 
image  but  B  is  shifted  respect  to  A.  Figure  C  is  the  modulus  image  from  applying  the 
correlation  function  to  image  A  (both  entries  to  the  function  are  image  A,  i.e.  this  is  the 
calculus  of  autocorrelation).  Figure  D  shows  the  correlation  result  between  A  and  B. 
Finally,  E  shows  in  yellow  the  correction  vector,  obtained  by  subtracting  the  coordinates 
of  the  brightest  point  of  D  and  the  brightest  point  on. 
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Ki67,ER,PR  in  SS25  (DCIS  Sample) 
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Structure 


Figure  4.  Results  of  the  analysis  of  protein  expression  (Ki67,ER,  PR)  in  the  UCSF4 
DCIS  case 
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Figure  5.  Examples  of  histological  areas  in  the  UCSF4  case.  A)  Normal  large  duct  with 
intraepithelial  invasion  (upper,  right  part  of  the  duct).  B)  Normal  TDLU.  C)  Area  of  DCIS 
invading  a  TDLU.  D)  Areas  of  advanced  DCIS. 
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Figure  6.  Results  of  the  analysis  of  protein  expression  (Ki67,ER,PR)  in  the  UCD1  DCIS 


case. 
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Figure  7.  Examples  of  histological  areas  in  the  UCD1  case.  A)  Normal  large  duct  with.  B) 
Large,  hyperplastic  duct.  C)  Normal  TDLU.  D)  Hyperplastic  TDLU.  E)  Distended  TDLU.  F) 
Area  of  DCIS. 
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ABSTRACT  We  present  a  new  system  for  simultaneous  morphological  and  molecular  analysis 
of  thick  tissue  samples.  The  system  is  composed  of  a  computer-assisted  microscope  and  a  JAVA- 
based  image  display,  analysis,  and  visualization  program  that  allows  acquisition,  annotation, 
meaningful  storage,  three-dimensional  reconstruction,  and  analysis  of  structures  of  interest  in 
thick  sectioned  tissue  specimens.  We  describe  the  system  in  detail  and  illustrate  its  use  by  imaging, 
reconstructing,  and  analyzing  two  complete  tissue  blocks  that  were  differently  processed  and 
stained.  One  block  was  obtained  from  a  ductal  carcinoma  in  situ  (DCIS)  lumpectomy  specimen  and 
stained  alternatively  with  Hematoxilyn  and  Eosin  (H&E),  and  with  a  counterstain  and  fluorescence 
in  situ  hybridization  (FISH)  to  the  ERB-B2  gene.  The  second  block  contained  a  fully  sectioned 
mammary  gland  of  a  mouse,  stained  for  histology  with  H&E.  We  show  how  the  system  greatly 
reduces  the  amount  of  interaction  required  for  the  acquisition  and  analysis  and  is,  therefore, 
suitable  for  studies  that  require  morphologically  driven,  wide-scale  (e.g.,  whole  gland)  analysis  of 
complex  tissue  samples  or  cultures.  Microsc.  Res.  Tech.  59:522-530 ,  2002.  Published  2002  wiiey-uss,  inc f 


INTRODUCTION 

Understanding  complex  biological  systems  requires 
tissue-level  integration  of  information  from  multiple 
sources  such  as  molecular,  physiological,  anatomical, 
and  so  on.  However,  none  of  the  existing  analytical 
methods  in  biology  provides  the  required  level  of  inte¬ 
gration,  in  that  they  either  do  not  account  for  intercel¬ 
lular  variation  (RFLP,  Southern  blots,  microarray 
technologies,  etc.),  or  they  do  it  at  the  expense  of  tissue 
integrity  (e.g.,  flow  cytometry). 

Image-based  cytometry  (IC)  can  provide  molecular  or 
genetic  information  (e.g.,  by  using  fluorescence  in  situ 
hybridization  [FISH]  or  immunohistochemistry  [IHQ]), 
in  fixed  cells  within  their  native  morphological  tissue 
context.  Volumetric,  3D  morphological  information  can 
be  obtained  using  confocal  laser  scanning  microscopy 
(CLSM).  However,  only  relatively  thin  tissue  sections 
(<100  |xm)  can  be  studied,  due  to  light  scattering  and 
refractive  index  mismatch  problems  that  occur  when 
imaging  deeper  in  the  tissue,  and  because  of  practical 
limitations  of  effectively  staining  thicker  sections  by 
IHQ  or  FISH. 

When  analysis  and  integration  of  molecular  and 
morphological  information  at  high  resolution  is  aimed 
on  a  bigger  scale  (e.g.,  tissue  or  a  small  gland),  the  only 
existing  approach  requires  sectioning  the  tissue,  fol¬ 
lowed  by  both  histological  and  molecular  staining  of 
consecutive  tissue  sections.  The  analysis  is  normally 
performed  by  visual  inspection  of  the  sections  under 
the  microscope.  This  approach  greatly  limits  the  extent 
and  accuracy  of  the  analysis,  due  to  the  difficulty  that 
our  visual  system  finds  when  composing  (extrapolat¬ 
ing)  meaningful  3D  information  from  a  series  of  2D 
sections.  Since  only  a  few  colors  (2-4)  can  be  discrim¬ 
inated,  both  in  bright  field  and  fluorescence  micros¬ 


copy,  along  with  other  practical  problems  related  to 
multicolor  IHQ  or  (F)ISH,  we  have  to  do  the  histologi¬ 
cal  and  the  molecular  staining  on  different,  alternative 
sections.  This  further  complicates  the  visual  analysis 
and  integration  of  molecular  and  morphological  infor¬ 
mation. 

To  overcome  these  problems,  we  have  developed  a 
three-dimensional  microscopy  system  that  integrates 
computer  analysis  and  visualization  tools.  These  tools 
automate  or  greatly  reduce  the  amount  of  interaction 
required  for  the  acquisition,  reconstruction,  and  mor¬ 
phologically  directed  analysis  of  thick  tissue  samples. 
Our  system  can  be  used  to  reconstruct  tissue  struc¬ 
tures  and  to  quantitatively  measure  the  presence  and 
spatial  distribution  of  different  molecular  elements 
(e.g.,  genes,  RNAs,  proteins)  in  their  intact  cellular 
environment.  This  tool  is  currently  being  used  to  study 
breast  cancer,  where  heterogeneity  and  three-dimen¬ 
sionality  are  at  the  very  base  of  both  disease  initiation 
and  clonal  progression. 

Our  system  encapsulates  a  three-dimensional  visu¬ 
alization  system  and  an  image  analysis  system.  The 
application  was  developed  using  a  distributed  architec¬ 
ture  (client-server  model),  and  Java  for  writing  the 
graphical  user  interface  (GUI)  so  that  it  can  run  re¬ 
motely  on  any  computer  platform.  The  system  allows 
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acquiring  and  registering  low  magnification  (e.g., 
1  pixel  =  5  pm)  conventional  (bright  field  or  fluores¬ 
cence)  images  of  entire  tissue  sections.  It  can  also  cre¬ 
ate  a  virtual  3D  reconstruction  of  the  tissue  structures, 
from  which  new  areas  of  interest  can  be  revisited  or 
reacquired  at  high  resolution  (e.g.,  1  pixel  =  0.5  |xm) 
and  automatically  analyzed. 

Although  there  are  existing  commercial  and  non¬ 
commercial  software  packages  that  can  separately  per¬ 
form  some  of  those  functions,  the  integration  of  all  of 
them  (multiresolution,  multicolor  acquisition  of  multi- 
field  fluorescence  microscopy  images;  reconstruction  of 
structures  on  interest  in  3D;  molecular  and  morpholog¬ 
ical  3D  analyses;  content-based  image  and  data  storage 
and  retrieval  system  through  the  use  of  “Cases”;  or  a 
series  of  consecutive  images  and  data  belonging  to  a 
given  tissue)  on  a  single  platform  makes  our  system  a 
very  powerful  tool. 

A  comparison  to  similar  systems  shows  that  many  of 
them  offer  a  set  of  independent  programs  running  un¬ 
der  different  interfaces  (IMOD ,  Boulder  Laboratory  for 
3-Dimensional  Fine  Structure),  but  not  a  common  plat¬ 
form  that  integrates  all  of  them.  Most  of  these  systems 
are  not  designed  to  run  on  multiplatform  environments 
(VIDA,  University  of  Iowa;  Trace,  Boston  University), 
some  others  have  special  hardware  requirements  for 
real  time  rendering  (VoxBlast,  Vaytek  Inc.)  and  many 
lack  a  proper  image  management  system  (Imaris,  Bit- 
plane  AG).  In  this  review,  we  describe  our  system  and 
we  illustrate  its  use  by  presenting  the  reconstruction  of 
a  mouse  mammary  gland  and  a  tumor  biopsy  of  a 
patient  with  ductal  carcinoma  in  situ  (DCIS)  of  the 
breast. 

MATERIALS  AND  METHODS 
System  Description 

The  system  is  controlled  by  a  client-server  applica¬ 
tion,  as  shown  in  Figure  1.  The  server  is  a  C  language 
application  that  runs  on  a  computer  (Dell  Inspiron, 
running  Solaris  7  for  Intel)  connected  to  an  Axioplan 
(Zeiss  Inc.,  Germany)  microscope.  The  server  actuates 
all  the  moving  parts  of  the  microscope:  motorized  scan¬ 
ning  stage,  excitation  filter  wheel,  and  arc-lamp  block¬ 
ing  shutter  (Ludl  Electronic  Products  Ltd.,  Hawthorne, 
NY).  It  controls  the  CCD  Microimager  camera  (Xillix 
Techonologies  Corp.,  Richmond,  British  Columbia, 
Canada)  as  well.  The  server  can  perform  basic  opera¬ 
tions,  such  as  acquiring  and  storing  images,  setting  the 
exposure  time  of  the  CCD,  moving  the  stage,  and  op¬ 
erating  the  filter  wheel.  In  addition,  it  has  been  pro¬ 
grammed  to  offer  more  complex  functions,  such  as  au¬ 
tomatically  focusing  the  microscope  or  acquiring  mul¬ 
tiple  field-of-view  images.  To  do  this,  the  server 
receives  each  order  and  divides  it  into  a  set  of  simple 
actions.  For  instance,  to  acquire  a  multiple  field-of- 
view  image,  the  server  asks  for  the  coordinates  of  the 
vertices  of  the  area  to  be  acquired  and  then  automati¬ 
cally  performs  the  required  sequence  of  stage  move¬ 
ments  and  camera  acquisitions.  The  output  is  a  mosaic¬ 
like  image  of  the  area.  The  server  can  do  multi-color 
acquisition  in  fluorescence  and  bright  field,  by  perform¬ 
ing  consecutive  acquisition  using  different  excitation 
filters  and  multi-band  emission  filters. 

Description  of  the  Client  Application.  The  client 
(R3D2)  is  connected  to  the  server  through  UNIX  sock¬ 


ets,  which  are  the  standard  for  Internet-based  commu¬ 
nications.  It  can  send  requests  to  the  server  from  any 
computer  connected  to  the  Internet.  To  obtain  the  max¬ 
imum  benefit  from  this,  R3D2  has  been  written  in 
JAVA  (v.1.2),  so  that  it  can  be  executed  on  many  dif¬ 
ferent  computer  platforms. 

Figure  2  shows  the  R3D2,s  complete  Graphical  User 
Interface  (GUI).  The  interface  is  divided  in  two  distin¬ 
guishable  parts.  One  (rightmost  vertical  panel  in  Fig. 

2)  provides  connections  to  the  server  and  allows  the 
user  to  request  its  services  through  a  user-friendly 
interface.  The  available  actions  can  be  classified  in  two 
groups. 

Basic  control  operations .  These  include  all  the  sim¬ 
ple,  atomic,  operations  provided  by  the  server,  such  as 
setting  the  objective  lens,  changing  the  excitation  filter 
(fluorescence),  setting  the  exposure  time  of  the  camera, 
moving  the  stage  to  the  absolute  origin  of  coordinates 
relative  to  which  all  measurements  are  taken,  opening/ 
closing  the  arc  lamp  blocking  shutter,  and  acquiring  a 
single  image  using  the  current  microscope  settings. 
R3D2  receives  the  image  from  the  server  and  displays 
it,  both  complete  (zoomed  out),  as  well  as  partially  (in 
its  original  resolution).  Only  a  small  part  of  the  full- 
resolution  image  can  be  displayed  at  full  resolution. 
The  zoomed  area  can  be  interactively  selected  by  mov¬ 
ing  a  window  on  the  complete  version  of  the  image  (Fig. 

3) .  Images  can  be  saved  both  in  ICS  (Dean  et  al.,  1990) 
and  JPEG  format.  When  images  are  saved  as  ICS,  all 
the  acquisition  parameters  (objective,  filter,  location  of 
the  image  on  the  slide  are,  etc.)  are  stored  in  the  ICS 
header  file.  JPEG  format,  compressed  or  not,  can  be 
used  as  an  alternative  format  when  the  user  does  not 
plan  any  future  analysis  of  the  images,  and  images  are 
stored  for  exchange,  document  creation,  or  web  pub¬ 
lishing. 

Complex  operations.  These  operations  combine  mul¬ 
tiple  atomic  operations  to  provide  the  following  frtae — 
tionality. 

•  Autofocus:  Automatically  focuses  the  microscope-  by=— 
taking  a  series  of  images  at  different  positions  in  the 

Z  axis  (step  size  =  0.50  pm  for  low  magnification 
images,  0.25  p  for  high  magnification  images)  and 
determining  the  best-focused  image  of  the  series. 
Blur,  due  to  out-of-focus  light,  reduces  image  con¬ 
trast,  which  can  be  detected  using  several  functions. 
Based  on  several  comparisons  described  in  the  liter¬ 
ature  (Firestone  et  al.,  1991;  Groen  et  al.,  1985; 
Santos  et  al.,  1997),  we  selected  an  autocorrelation- 
based  function  introduced  by  Vollath  (1987). 

•  Scan:  Acquires  multiple  field  of  view  images.  The 
system  displays  a  dialogue-panel  where  the  user  can 
specify  the  filter(s)  to  be  used  (in  fluorescence  micros¬ 
copy),  exposure  time(s)  and  the  limits  of  the  area  to 
acquire.  The  limits  can  be  defined  by  its  coordinates 
(when  known)  or  manually,  by  moving  the  micro¬ 
scope  to  the  upper,  lower,  rightmost,  and  leftmost 
points  of  the  area. 

•  Revisit  Point*  When  the  user  clicks  on  a  point  of  the 
image  of  a  previously  acquired  complete  or  partial 
tissue  section,  the  server  moves  the  stage  to  that 
location  on  the  slide  and  takes  an  image  using  the 
current  values  of  objective,  filter,  and  exposure  time. 

•  Revisit  Area*  When  using  this  option,  the  user  is 
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Fig.  1.  Description  of  the  client-server  architecture  of  R3D2.  The 
server  runs  on  a  computer  with  a  microscope  attached,  and  provides 
access  to  the  microscope  functions.  The  client  is  a  JAVA  application, 
which  can  run  on  any  computer  (no  particular  OS  required)  connected 
to  the  server  through  the  Internet  Protocol  (IP).  The  communication 
between  client  and  server  uses  UNIX  sockets.  The  client  provides 
user-friendly  access  to  all  the  microscope  functions  offered  by  the 
server,  and  allows  handling  of  sets  of  related  images  (Cases)  for 
storage,  annotation,  and  3D  reconstruction  of  structures  of  interest. 
(Color  figure  can  be  viewed  in  the  online  issue,  which  is  available  at 
www.interscience.wiley.com.] 


asked  to  draw  a  rectangle  on  one  image  of  a  previ¬ 
ously  acquired  section.  The  selected  area  will  be  then 
acquired  with  the  microscope  settings  provided  by 
the  user.  Multicolor  area  acquisition  is  an  option  as 
it  is  for  scanning  complete  sections. 

The  second  part  of  the  interface  (two  left  vertical 
panels  in  Fig.  2)  expands  the  system  capabilities,  by 
allowing  creating  and  handling  sets  of  related  images, 
which  we  call  Cases.  A  Case  is  a  sequence  of  low- 
magnification  images  of  complete  tissue  sections  taken 
from  a  tissue  block,  along  with  all  the  areas  re-visited 
on  them  at  higher  resolution  and  with  different  filters, 
plus  the  results  of  analysis  performed  on  them  if  any. 
The  image  files  that  make  up  a  Case  are  specially 
labeled  for  convenience.  The  user  can: 

•  Annotate  the  Cases,  by  marking  and/or  delineating 
structures  of  interest  and  linking  them  within  and 
between  consecutive  sections.  The  user  can  add  tex¬ 
tual  annotations  (Text),  ductal  structure  identifiers 
(with  a  unique  number  that  identifies  them  within 
the  section,  Duct)  and  forms  that  delineate  irregu¬ 
larly  shaped  ducts  or  other  structures  (Shapes).  In 
addition  to  this,  ductal  marks  can  be  connected 
within  the  same  section  or  in  different  sections  (Con¬ 
nected  Ducts),  and  corresponding  shapes  in  different 
sections  can  be  grouped  (Groups). 

•  Register  acquired  sections.  Before  reconstructing  a 
Case  in  3D,  all  its  sections  must  be  registered  to 
ensure  proper  alignment  of  the  elements  that  will  be 
later  reconstructed.  For  that,  we  calculate  the  Rigid- 
Body  Transform  that  provides  the  optimum  rotation 
and  translation  between  each  pair  of  sections.  The 
Transform  is  calculated  from  three  pairs  of  points 
interactively  marked  by  the  user  on  each  pair  of 
images  to  be  registered.  Once  the  points  have  been 
marked,  the  software  calculates  the  rotation  and 
translation  (0,  tx,  ty)  needed  to  minimize  the  sum  of 
the  squared  distances  between  all  three  pairs  of  cor¬ 
responding  points,  thus  aligning  both  images.  The 
results  are  stored  in  the  second  image.  This  method 
is  very  accurate  when  the  pairs  of  points  are  spread 
all  along  the  sections.  Reasons  for  small  errors  are 
imprecise  mouse  interaction,  stretching  and/or  com¬ 


Fig.  2.  R3D2’s  Graphical  User  Interface.  The  GUI  is  divided  in  two 
main  parts.  Left:  Display  consecutive  sections  of  a  Case.  It  also 
provides  the  user  with  tools  for  registering  sections,  annotating  the 
images,  connecting  structures  between  sections  (e.g.  mammary  ducts, 
tumor  volumes),  and  reconstructing  the  annotated  images  in  3D. 
Right:  Access  to  the  microscope  related  functions  offered  by  the 
Server.  [Color  figure  can  be  viewed  in  the  online  issue,  which  is 
available  at  www.interscience.wiley.com.! 


pression  of  the  tissue,  and  the  fact  that  some  struc¬ 
tures  used  to  select  pairs  of  corresponding  points 
might  not  be  perpendicular  to  the  sections. 

•  Reconstruct  Cases.  Our  system  reconstructs  the  tis¬ 
sue  structures  by  rendering  the  user  annotations  in 
3D.  Besides  the  obvious  advantage  of  volumetric  tis¬ 
sue  structure  visualization,  the  3D  rendering  is 
linked  to  the  microscope  for  revisiting,  and  to  the 
original  images  and  their  analysis  for  display  of  the 
images  and  analysis  results,  as  will  be  described 
later.  The  3D  reconstruction  part  of  the  software  has 
been  developed  using  Java3D  (v.  1.2.1.)  This  is  an 
application-programming  interface  (API)  for  3D  in 
Java.  After  asking  the  user  for  the  range  of  sections 
to  render,  the  system  converts  the  coordinates  of  all 
the  markings  in  that  range  of  sections  from  two- 
dimensional  to  three-dimensional  values.  The  sec¬ 
tion  number  and  thickness,  along  with  the  distance 
between  sections,  determines  the  depth-coordinate. 
Then  a  3D  scene  is  built  using  several  geometric 
shapes  to  represent  the  different  markings.  Duct 
markings  are  rendered  as  spheres  and  connected 
with  lines  within  and  between  sections.  Contour 
Shapes  are  rendered  as  volumes  after  applying  a 
refined  Delaunay  triangulation,  using  the  Nuages  re¬ 
construction  software  (INRLA,  France,  http^/www-sop. 
inria.fr/prisme/)  (Boissonnat  et  Geiger,  1993).  The 
3D  rendition  of  the  Case  is  displayed  in  a  new  win¬ 
dow  where  the  mouse  can  rotate  the  scene,  zoom  in 
or  out  or  translate  the  scene  in  the  X  and/or  Y  direc¬ 
tions.  The  3D  window  includes  a  tool  bar  with  op¬ 
tions  to  select  elements.  By  just  clicking  on  one  ele¬ 
ment  of  the  3D  scene,  the  user  can  get  information 
about  it  (location,  size,  etc.),  load  the  image(s)  that 
contain  that  element,  images  are  displayed  in  a  new 
image  panel,  or  move  the  microscope  to  its  location 
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on  the  slide  for  re-imaging.  Volume  selection  is  han¬ 
dled  by  JAVA  3D.  The  selection  is  performed  by 
tracing  a  “virtual  ray”  from  the  user’s  point  of  view 
(defined  when  rendering  the  Case,  normally  at  a 
point  corresponding  approximately  to  the  position  of 
the  user’s  eye)  and  the  point  where  the  user  clicks  on 
the  screen.  The  selected  volume  will  be  the  first 
object  intersected  by  the  ray  within  the  3D  scene. 
The  user  can  also  hide  or  show  all  the  different 
elements  of  the  scene,  reset  it  to  the  default  view  and 
change  the  scale  in  any  of  the  three  dimensions. 

•  Analyze  Cases.  All  areas  selected  based  on  the  3D 
morphological  reconstruction,  can  be  batch  pro¬ 
cessed  upon  a  user  request.  This  way,  only  those 
areas  selected  based  on  a  particular  morphological 
feature  are  analyzed,  and  not  all  the  tissue  sections, 
thus  reducing  the  amount  of  work  required.  The 
analysis  is  done  by  streamlining  the  selected  images 
to  a  new  process  running  custom-made  image  anal¬ 
ysis  routines  built  on  a  commercial  image  processing 
software  (Scilimage,  TNO,  The  Netherlands).  At  the 
moment,  the  image  analysis  routines  can  segment 
counterstained  nuclei  and  detect  and  quantify  FISH 
probes  or  punctuate-pattemed  expressed  proteins. 
The  image  analysis  algorithms  for  nuclei  and  signal 
segmentation  have  been  described  elsewhere  (Mal- 
pica  et  al.,  1997,  Ortiz  de  Solorzano  et  al.,  1998, 
Malpica  et  al.,  1999).  The  results  of  the  analysis  can 
be  displayed  from  the  3D  rendering  window,  and 
global  measurements  can  be  performed  after  select¬ 
ing  the  volumes. 

An  important  feature  of  R3D2  is  that  all  Case-han¬ 
dling  and  marking  functions  can  be  used  in  parallel  to 
the  functions  that  request  microscope  actions.  There¬ 
fore,  acquiring  a  new  section  can  be  done  in  parallel  to 
any  other  Case  related  function  (e.g.,  registering  al¬ 
ready  acquired  sections  or  annotating  the  images).  Our 
implementation  of  this  feature  uses  Solaris  threads. 
Threads  permit  executing  multiple  parallel  copies 
of  a  program  without  multiplying  resource  use.  Each 
thread  shares  memory  and  other  resources  with  other 
threads.  R3D2  runs  on  a  main  thread  and  when  a 
microscope-based  operation  is  selected,  it  launches  a 
new  thread  that  runs  on  the  same  memory  space  as  the 
main  one.  This  scheme  guarantees  that,  in  the  case  of 
a  microscope  failure  or  a  socket  error,  the  system  will 
not  die  abruptly,  as  only  the  thread  working  on  the 
microscope  will  be  affected. 

RESULTS 

We  will  now  illustrate  the  use  of  our  system  by 
showing  how  two  different  tissue  blocks  were  imaged 
and  reconstructed.  The  first  one  (HB)  is  a  tissue  block 
from  the  mammary  gland  of  a  patient  with  ductal  car¬ 
cinoma  in  situ  of  the  breast  (DCIS).  The  second  block 
(MB)  is  a  normal  mammary  gland  of  a  nuliparus 
mouse. 

Tissue  Source 

HB.  The  tissue  was  part  of  a  breast  lumpectomy 
specimen.  After  surgery,  the  specimen  was  fixed  in  an 
alcoholic-formalin  solution,  and  embedded  in  paraffin 
per  routine  at  the  California  Pacific  Medical  Center  in 
1981.  The  tissue  was  originally  staged  as  a  T1N0M0 


Fig.  3.  Single  image  acquisition.  To  acquire  a  single  image,  the 
user  clicks  on  the  “Acquire”  button,  after  choosing  the  appropriate 
microscope  settings:  objective  lens,  excitation  filter  (in  fluorescence), 
and  exposure  time  of  the  CCD  camera.  The  “Autofocus”  option  auto¬ 
matically  finds  the  correct  focus  plane  of  the  microscope  before  ac¬ 
quiring  an  image.  Top:  The  entire  image  (reduced  to  fit  in  the  win¬ 
dow).  Bottom:  Contains  a  zoomed  version  of  the  part  of  the  original 
image  selected  by  the  rectangle  on  the  top  window.  The  rectangle  can 
be  manually  moved  to  look  at  different  parts  of  the  image  at  its 
original,  full,  resolution.  (Color  figure  can  be  viewed  in  the  online 
issue,  which  is  available  at  www.interscience.wiley.com.) 
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Fig.  4.  Tissue  acquisition.  Images  of  full  sections  of  a  tissue  block 
of  ductal  carcinoma  in  situ  of  the  breast  (A,B)  and  of  a  mouse  mam¬ 
mary  gland  (C).  Both  blocks  were  paraffin  embedded,  sectioned,  and 
stained  for  Histology  (H&E,  images  A  and  C)  or  counterstained  with 


(Stage  1).  Several  blocks  of  approximately  lw  X  1”  X 
3  mm  were  taken  from  the  tissue.  The  block  we  used 
contained  DCIS  and  benign  ducts;  there  was  no  inva¬ 
sive  tumor  present.  The  tumor  was  positive  for  ERBB2 
by  immunohistochemistry  performed  on  a  4-|xm  section 
taken  from  the  top  of  the  block. 

MB.  Female  BALB/c  mice  were  obtained  from  Si¬ 
monson  (Gilroy,  CA)  and  housed  4  per  cage  with  chow 
and  water  ad  libitum  in  a  temperature-  and  light- 
controlled  facility.  Carbon  dioxide  inhalation  was  used 
to  kill  the  animals  in  accordance  with  the  Association 
for  Assessment  and  Accreditation  of  Laboratory  Ani¬ 
mal  Care  guidelines  and  institutional  review  and  ap¬ 
proval.  The  4th  inguinal  glands  were  removed  for  his¬ 
tology  and  whole  mounts  at  10  weeks.  The  tissue  was 
formalin-fixed  and  paraffin-embedded. 

Tissue  Preparation 

HB.  The  human  block  was  sectioned  at  4  pim  thick¬ 
ness.  Every  5th  section  was  collected  onto  a  plus  (treated) 
slide,  for  a  total  of  66  sections.  From  this  set,  every 
other  section  was  collected  and  stained  with  H&E 


DAPI  (B).  Microscope  focusing  and  image  acquisition  is  completely 
automated,  as  described  in  the  text.  In  the  individual  images,  thetop 
contains  the  entire  section,  the  bottom  contains  a  zoomed  sub-area 
of  it. 


(33  sections,  40  p,m  apart).  The  rest  of  collected  sections 
were  sent  to  the  UCSF  Cytogenetics  core  for  FISH 
(33  sections,  40  |im  apart),  with  a  probe  for  the  ERBB2 
gene,  which  is  amplified  in  30%  of  carcinomas  of  the 
breast.  The  ERBB2  probe  was  RMC17P077,  a  PI 
probe.  The  DNA  was  extracted  and  labeled  by  Nick 
translation  with  red  CY3  dUTP  fluorochrome.  In  sum¬ 
mary,  the  odd-numbered  sections  were  H&E  stained 
and  the  even-numbered  sections  were  DAPI  counter- 
stained  and  processed  for  FISH. 

MB.  The  mouse  gland  was  paraffin-embedded,  H&E 
stained,  and  sectioned  at  4  jxm.  We  starting  collecting 
every  fifth  section  (20  pirn  apart);  80  |xm  into  the  tissue, 
we  switched  and  collected  all  remaining  sections  (4  pirn 
apart),  for  a  total  of  24  sections.  The  collected  sections 
were  all  placed  on  glass  slides  for  microscopy  and  im¬ 
aging. 

Imaging  of  Tissue  Sections 

HB.  Full  tissue  sections  were  imaged  in  bright  field 
(odd-numbered  sections)  (Fig.  4A),  or  fluorescence 
(even-numbered  sections)  (Fig.  4B),  using  R3D2’s  Scan 
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Fig.  5.  Tissue  annotation  (human  tissue).  Two  consecutive  H&E 
stained  sections  of  a  human  DCIS  Case.  Only  H&E  sections  are  now 
being  shown.  The  images  show  manual  annotations,  namely  ducts 
and  tumor  masses.  Ducts  are  connected  within  and  between  sections. 
Tumor  contours  are  drawn  and  also  connected  between  consecutive 
sections.  Connections  between  sections  are  shown  by  changing  line 
color  of  all  connected  components  every  time  one  of  them  is  visited 
(selected  or  just  traveled  over  by  the  mouse  pointer).  [Color  figure 
can  be  viewed  in  the  online  issue,  which  is  available  at  www. 
interscience.wiley.com.] 


option.  Fluorescent  sections  were  imaged  using  a  sin¬ 
gle-band  filter  block  (360  nm  excitation)  for  DAPI. 

MB.  All  sections  were  imaged  in  bright  field  (Fig. 
40. 

In  both  blocks,  sections  were  imaged  at  10 X  with  a 
Fluar  (0.5  n.a.)  objective  lens  (Zeiss,  Wetzlar,  Germany). 
In  order  to  optimize  memory  use  while  keeping  enough 
resolution  for  histology,  images  were  reduced  by  a  fac¬ 
tor  of  4  in  both  X  and  Y  directions,  which  gave  us  an 
effective  2.5 X  magnification,  i.e.,  a  sixteen-fold  reduc¬ 
tion  in  image  size.  After  image  compression,  the  aver¬ 
age  image  size  of  the  sections  was  25  Mbytes  in  the 
human  block  and  10  Mbytes  in  the  mouse.  The  com¬ 
pression  was  necessary  to  speed  up  image  transmission 
and  display. 

Creation  of  Cases 

Two  Cases  were  created,  initially  empty.  The  ac¬ 
quired  sections  were  added  with  a  number  equal  to  its 
section  number  within  the  tissue.  All  sections  were 
manually  registered  as  previously  described  to  ensure 
proper  alignment  of  the  to-be-done  annotations.  Each 
section  was  registered  to  its  previous  section,  thus 
aligning  all  sections  with  the  first  H&E  section  of  the 
block. 
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Fig.  6.  Tissue  annotation  (mouse  tissue).  Two  consecutive  H&E 
stained  sections  of  the  mouse  mammary  gland  used  to  test  the  system. 
Ducts  are  connected  within  and  between  sections.  Lymph  node  coun¬ 
tours  are  drawn  and  also  connected  between  consecutive  sections. 
Connections  between  sections  are  shown  by  changing  line  color  of  all 
connected  components  every  time  one  of  them  is  visited  (selected  or 
just  traveled  over  by  the  mouse  pointer).  [Color  figure  can  be  viewed 
in  the  online  issue,  which  is  available  at  www.interscience.wiley.com.] 


Tissue  Annotation 

HB.  All  H&E  sections  were  manually  annotated.  We 
marked  the  centers  of  the  ducts  by  placing  a  circular 
mark  (R3D2’s  Duct  tool)  in  the  lumen,  when  the  duct 
was  perpendicular  to  the  image  plane,  or  with  a  line, 
when  the  duct  was  sectioned  longitudinally.  Then  we 
connected  the  markings  within  and  between  consecu¬ 
tive  sections  using  the  Connect  Ducts  option.  We  also 
delineated  tumor  areas  using  R3D2’s  Shapes  option 
and  grouped  their  consecutive  sections  using  the  Group 
option.  (Fig.  5). 

MB.  The  same  procedure  was  followed  for  the  mouse 
except  for  the  Shapes,  which  were  not  used  to  delineate 
tumor  areas,  but  the  lymph  nodes  (Fig.  6). 

3D  Reconstruction 

The  reconstruction  of  the  Cases,  based  on  the  H&E 
sections,  is  shown  in  Figures  7(HB)  and  8  (MB).  All  the 
user  markings  are  interactive,  in  that  by  clicking  on 
them  the  user  can  (1)  obtain  positional  information,  {2) 
load  the  part  of  the  original  image  section(s)  containing 
that  marking,  or  ( 3 )  revisit  the  selected  marking  under 
the  microscope.  For  the  latter,  the  microscope  automat¬ 
ically  moves  to  the  position  of  the  marking  in  the  tis¬ 
sue,  provided  that  the  right  slide  is  on  the  microscope. 
In  situ  tumors  and  lymph  nodes  were  rendered  as 
volumes  in  3D,  and  ducts  as  lines. 
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Fig.  7.  Tissue  reconstruction  (hu¬ 
man  case).  3D  reconstruction  of  the 
manually  annotated  DCIS  Case.  A:  Tu¬ 
mor  volumes  have  been  surface-ren¬ 
dered  to  show  their  three-dimensional 
shapes.  Ducts  are  identified  by  spheres 
and  connected  by  lines  within  and  be¬ 
tween  sections.  Yellow  rectangles  iden¬ 
tify  areas  that  were  re-acquired  at 
higher  magnification.  B:  Close  up  view 
of  a  set  of  connected  markings  corre¬ 
sponding  to  the  same  duct.  The  3D  re¬ 
construction  is  fully  interactive.  It  can 
be  rotated,  zoomed,  and  all  the  elements 
can  be  selected  to  retrieve  information 
about  the  element,  display  the  original 
image  that  contains  the  element,  or  au¬ 
tomatically  move  the  microscope  to  the 
selected  point. 


Fig.  8.  Tissue  reconstruction 
(mouse  Case).  3D  reconstruction  of  the 
H&E  stained,  manually  annotated 
mouse  Case  used  to  test  our  system. 
Lymph  node  volumes  have  been  sur¬ 
face-rendered  to  show  their  three-di¬ 
mensional  shapes.  Ducts  are  identi¬ 
fied  by  spheres  and  connected  by  lines 
within  and  between  sections. 
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Fig.  9.  Revisiting  and  analysis  of  areas  of  interest.  Revisited  area 
of  the  Human  DCIS  tissue  block  used  to  test  the  system.  A:  DAPI 
image  of  counterstained  nuclei;  B:  CY3  images  of  FISH  with  a  probe 
to  the  erb-b2  gene.  Areas  can  be  acquired  and  loaded  from  the  images 
or  the  sections  they  belong  to  or  directly  from  the  3D  reconstruction  of 


Revisit  of  Areas  of  Interest 

HB.  Several  areas  were  revisited  at  high-resolution 
(40  X),  with  a  Plan-Neofluar  (1.3  na)  oil  immersion  lens 
(Zeiss,  Wetzlar,  Germany).  Areas  were  manually  se¬ 
lected  having  either  morphologically  normal  or  abnor¬ 
mal  (DCIS)  areas.  Areas  were  double-scanned  using  a 
Pinkel  filter  set  (Chroma  Technologies,  Brattleboro, 
VT),  by  consecutively  imaging  while  exciting  the  sam¬ 
ple  at  360  (DAPI  emission  from  the  nuclei)  and  572  nm 
(CY3  emission  from  ERBB2  gene).  All  areas  were  ac¬ 
quired  at  full  resolution  and  then  compressed  by  a 
factor  of  2  in  both  X  and  Y  directions  for  an  effective 
20 x  magnification.  Areas  were  manually  selected  by 
drawing  rectangles  on  the  low-resolution  images  of  the 
whole  fluorescent  sections,  although  they  could  have 
been  selected  from  the  3D  reconstruction  as  well.  To 
ensure  proper  alignment  between  the  areas  and  the 
sections,  the  images  of  the  sections  were  previously 
aligned  with  the  microscope  slide  by  calculating  the 
shift  between  the  current  location  of  the  slide  and  its 
location  when  the  image  of  the  section  was  originally 
acquired.  This  is  done  by  calculating  the  rigid  body 
transform  between  three  points  manually  selected  in 
the  image  of  the  section  and  the  corresponding  points 
under  the  microscope.  Figure  7  shows  the  3D  recon¬ 
struction  of  the  human  Case,  incorporating  the  high- 
resolution  acquired  areas,  displayed  as  rectangles.  As 
all  other  elements  in  the  reconstruction  model,  the 
areas  can  be  loaded  or  revisited  by  clicking  on  them 
(Fig.  9).The  total  number  of  areas  imaged  was  160. 

Analysis  of  Areas  of  Interest 

HB.  All  160  areas  were  segmented  overnight  using 
the  Analyze  Case  function.  In  this  particular  case, 


the  tissue.  C:  Results  of  the  segmentation  of  nuclei  based  on  the 
counterstained  channel.  Each  segmented  nucleus  is  colored  differ¬ 
ently.  (Color  figure  can  be  viewed  in  the  online  issue,  which  is  avail¬ 
able  at  www.interecience.wiley.com.] 


DAPI  counterstained  nuclei  were  segmented  using  a 
two-step  algorithm  that  first  uses  an  adaptative 
threshold  to  separate  DNA  areas  from  the  background 
and  then  applies  a  Hough-transform  -I-  Watershed  al¬ 
gorithm  to  separate  clusters  of  nuclei  resulting  of  the 
adaptative  thresholding.  FISH  signals  were  segmented 
using  a  TOP-HAT  morphological  algorithm  followed  by 
morphological  reconstruction.  The  FISH  segmentation 
algorithm  was  applied  only  to  those  areas  identified  as 
nuclei  on  the  DAPI  channel.  A  detailed  description  of 
these  methods  is  out  of  the  scope  of  this  study  and  can 
be  found  in  the  literature  (Malpica  et  al.,  1997,  Ortiz  de 
Solorzano  et  al.,  1998,  Malpica  et  al.,  1999). 

The  total  number  of  nuclei  segmented  was  above 
200,000.  The  results  of  the  nuclear  and  FISH  segmen¬ 
tation  can  be  displayed  for  every  area  (Fig.  9)  from  the 
images  of  the  sections  they  belong  to  or  directly  from 
the  3D  rendering  window. 

DISCUSSION 

We  have  presented  a  new  and  powerful  computer- 
based  system  that  allows  automation  of  the  acquisi¬ 
tion,  storage,  and  analysis  of  thick  sectioned  tissue 
specimens.  The  system  has  been  described  and  demon¬ 
strated  by  reconstructing  tissue  blocks  from  two  tissue 
sources,  processed  using  different  staining  and  micros¬ 
copy  protocols. 

By  integrating  information  from  different  types  of 
staining,  both  histological  (e.g.,  H&E)  and  molecular 
(e.g.,  IHQ  or  FISH),  we  allow  simultaneous  morpholog¬ 
ical  and  molecular  analysis  of  the  specimens.  The  mo¬ 
lecular  analysis  is  not  only  simultaneous,  but  in  fact 
driven  by  the  morphology  as  the  areas  acquired  and 
analyzed  can  be  selected  directly  from  the  3D  recon- 
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struction  of  the  structures  marked  on  the  low-resolu¬ 
tion  images.  This  way,  the  labor-intensive  task  of  ac¬ 
quiring  and  analyzing  the  images  is  done  semi-auto- 
matically — the  slides  still  need  to  be  manually  placed 
on  the  microscope  and  the  areas  to  be  acquired  drawn 
on  the  images  of  the  sections — enormously  reducing 
the  time  and  labor.  In  fact,  large-scale  analysis  can  be 
done  (e.g.,  whole  gland  analysis),  which  would  be  un¬ 
thinkable  otherwise. 

A  classical  problem  of  fluorescence  microscopy, 
which  limits  the  number  of  labeled  elements  to  the 
number  of  fluorochromes  that  can  be  discriminated 
from,  can  be  overcome  by  using  single  or  dual  color 
staining  on  several  adjacent  sections,  provided  that  the 
distance  between  histological  (e.g.,  H&E)  sections  en¬ 
closing  them  allows  detection  of  continuity  between  the 
structures  of  interest. 

Future  work  will  involve  adding  new  functionality  to 
the  system  by  augmenting  the  number  of  analysis  func¬ 
tions  provided  (e.g.,  detection  of  cytoplasmic  or  extra 
cellular  proteins),  and  automating  the  detection  of  the 
structures  of  interest,  which  at  this  point  is  the  most 
time-consuming  task  when  reconstructing  a  Case.  In¬ 
teraction  could  be  further  reduced  by  using  an  auto¬ 
matic  slide  feeder  attached  to  the  microscope,  which 
would  eliminate  the  manual  loading  of  slides  for  revis¬ 
iting  or  acquisition  of  areas. 
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Abstract.  Real-time  three-dimensional  (3-D)  reconstruction  of  epithe¬ 
lial  structures  in  human  mammary  gland  tissue  blocks  mapped  with 
selected  markers  would  be  an  extremely  helpful  tool  for  diagnosing 
breast  cancer  and  planning  treatment.  Besides  its  clear  clinical  appli¬ 
cation,  this  tool  could  also  shed  a  great  deal  of  light  on  the  molecular 
basis  of  the  initiation  and  progression  of  breast  cancer.  We  present  a 
framework  for  real-time  segmentation  of  epithelial  structures  in  two- 
dimensional  (2-D)  images  of  sections  of  normal  and  neoplastic  mam¬ 
mary  gland  tissue  blocks.  Complete  3-D  rendering  of  the  tissue  can 
then  be  done  by  surface  rendering  of  the  structures  detected  in  con¬ 
secutive  sections  of  the  blocks.  Paraffin-embedded  or  frozen  tissue 
blocks  are  first  sliced  and  sections  are  stained  with  hematoxylin  and 
eosin.  The  sections  are  then  imaged  using  conventional  bright-field 
microscopy  and  their  background  corrected  using  a  phantom  image. 
We  then  use  the  fast-marching  algorithm  to  roughly  extract  the  con¬ 
tours  of  the  different  morphological  structures  in  the  images.  The  re¬ 
sult  is  then  refined  with  the  level-set  method,  which  converges  to  an 
accurate  (subpixel)  solution  for  the  segmentation  problem.  Finally,  our 
system  stacks  together  the  2-D  results  obtained  in  order  to  reconstruct 
a  3-D  representation  of  the  entire  tissue  block  under  study.  Our 
method  is  illustrated  with  results  from  the  segmentation  of  human  and 
mouse  mammary  gland  tissue  samples.  ©  2004  Society  of  Photo-Optical  Instru¬ 
mentation  Engineers.  (DOI:  1 0.1 1 1 7/1 .1 69901 1 1 

Keywords:  breast  cancer,  molecular  analysis,  automatic  segmentation,  3-D  recon¬ 
struction,  fast-marching,  level  set. 
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1  Introduction 

The  mature  mammary  gland  is  a  treelike  organ  made  up  of 
three  different  levels  of  branching  ducts  converging  at  the 
nipple.1  The  ducts  are  lined  by  epithelial  cells  and  end  in 
secretory  lobuloalveolar  structures,  which  are  the  sites  of  milk 
production  during  lactation.  In  cancer,  this  hierarchical  orga¬ 
nization  is  disrupted  by  uncontrolled  growth  of  the  epithe¬ 
lium,  and  sometimes  by  the  subsequent  invasion  of  the  sur¬ 
rounding  tissue.2  These  morphological  or  structural  changes 
are  accompanied  by  other  genetic  and  epigenetic  changes  at 
the  cellular  level  (see  Fig.  1).  An  example  of  this  is  seen  in 
ductal  carcinoma  in  situ  (DCIS),  which  is  a  preinvasive  form 
of  cancer.  In  DCIS,  the  molecular  changes  common  in  breast 
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cancer  can  be  observed  together  with  well-defined  morpho¬ 
logical  patterns,  such  as  comedo  or  cribiform  ones,  with  or 
without  central  necrosis.3,4 

An  interesting  problem  is  the  quantification  of  these  ge¬ 
netic  or  molecular  changes  in  the  context  of  the  tissue  envi¬ 
ronment  where  they  occur.  In  this  way,  by  looking  at  cancer 
as  an  organ  that  is  inherently  heterogeneous  and  dynamic,  we 
believe  that  we  will  obtain  a  better  understanding  of  the 
events  that  drive  the  progression  of  the  disease.  Therefore, 
since  the  normal  mammary  gland  and  its  neoplastic  variants 
are  neither  flat  nor  homogeneous,  the  approach  to  this  prob¬ 
lem  should  be  three-dimensional  as  well,  and  take  into  ac¬ 
count  the  heterogeneity  of  the  gland.  However,  most  classic 
methods  in  biology  focus  on  only  a  single  type  of  abnormality 
(molecular  or  morphological)  and/or  neglect  three- 
dimensionality  and  heterogeneity. 

Although  imaging  of  both  tissue  structure  and  function  in 
vivo  would  be  extremely  desirable,  the  existing  in  vivo  imag- 
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Automatic  segmentation  of  histological  structures  .  .  . 


(a)  (b) 


Fig.  1  Morphological  and  genetic  alterations  in  breast  cancer.  The  images  show  two  optical  sections  of  human  mammary  gland  tissue  acquired 
using  a  confocal  laser  scanning  microscope;  nuclei  are  displayed  in  red;  a  probe  for  a  certain  DNA  sequence  in  chromosome  1 7  is  shown  in  green, 
(a)  Normal  tissue;  as  expected,  each  nucleus  contains  up  to  two  green  signals  (up  to  two  copies  of  chromosome  1 7  per  nucleus  in  a  single  optical 
section),  (b)  Neoplastic  lesion;  not  only  do  some  nuclei  contain  more  than  two  copies  of  the  probe,  but  they  also  have  distinct  morphological 
changes  that  can  be  observed  in  this  section. 


ing  methods  (x-ray,  magnetic  resonance  imaging,  optical  to¬ 
mography,  etc.)  do  not  provide  the  necessary  resolution  for 
cell-level  molecular  analysis.  In  addition,  these  methods  pro¬ 
vide  morphological  information  that  can  only  be  indirectly 
related  to  the  function  of  the  tissue.  Consequently,  ex  vivo 
microscopic  analysis  of  the  tissue  from  flat  fixed  sections  is 
the  routine  method  in  histopathology.  However,  the  limited 
ability  of  the  human  eye  to  extrapolate  and  visualize  3-D 
structures  from  sequences  of  2-D  scenes  renders  this  method 
unsuitable  for  quantitative  3-D  tissue  characterization.  To 
overcome  these  issues,  we  have  developed  a  system  for  simul¬ 
taneous  morphological  and  molecular  analysis  of  thick  tissue 
samples.5  The  system  consists  of  a  computer-assisted  micro¬ 
scope  and  a  JAVA-based  image  display,  analysis,  and  visual¬ 
ization  program  (R3D2).  R3D2  allows  semiautomatic  acqui¬ 
sition,  annotation,  storage,  3-D  reconstruction,  and  analysis  of 
histological  structures  (intraductal  tumors,  normal  ducts, 
blood  vessels,  etc.)  from  thick  tissue  specimens.  For  this  pur¬ 
pose  the  tissue  needs  to  be  embedded  in  a  permanent  or  semi¬ 
rigid  medium  after  collection.  The  tissue  is  then  fully  sec¬ 
tioned,  and  the  resulting  sections  are  stained  in  a  way  that 
visually  highlights  the  desired  structures.  In  histopathology, 
hematoxylin  and  eosin  (H&E)  is  the  most  common  combina¬ 
tion  of  dyes  used  to  observe  the  morphology  of  the  tissue.  In 
order  to  image  not  only  structure,  but  also  molecular  events, 
we  alternate  H&E  staining  with  fluorescent  staining  (immun¬ 
ofluorescence,  fluorescence  in  situ  hybridization)  of  proteins 
and  selected  genes  in  consecutive  sections. 

This  paper  focuses  on  the  annotation  of  the  structures  of 
interest  on  the  H&E-stained  sections.  Manual  segmentation 
has  been  used  before  to  delineate  histological  structures.6-8  In 
order  to  build  the  3-D  reconstruction  of  the  block,  we  initially 
annotated  each  of  the  interesting  features  of  the  tissue  on  each 
section  by  using  manually  drawn  contours.  This  step  consti¬ 
tuted  a  bottleneck  in  the  study  of  samples.  Semiautomatic 
approaches  to  segmentation  of  features  of  interest  in  histologi¬ 
cal  sections  have  also  been  used,1*  but  they  still  involve  too 
much  user  interaction  to  be  useful  for  reconstructing  large 
tissue  samples.  Automatic  segmentation  of  the  structures  of 


interest  is  the  answer  to  this  problem.  We  propose  an  auto¬ 
matic  method  followed  by  interactive  correction  that  greatly 
reduces  the  amount  of  interaction  required,  thus  allowing  us 
to  use  our  system  for  imaging  and  reconstruction  of  large, 
complex  tissue  structures. 

Automatic  extraction  of  contours  in  2-D  images  is  usually 
done  with  active  contour  models,  originally  presented  by  Kass 
et  al.10  These  methods  are  based  on  deforming  an  initial  con¬ 
tour  (polygon)  toward  the  boundary  of  the  desired  object  to 
extract  in  an  image.  The  deformation  is  achieved  by  minimiz¬ 
ing  a  certain  energy  function,  which  is  computed  by  integrat¬ 
ing  along  the  contour,  terms  related  to  its  continuity,  and 
terms  related  to  the  pixel  values  of  the  area  of  the  image 
where  the  contour  is  defined.  That  energy  function  approaches 
a  minimum  near  the  object’s  boundary,  and  thus  the  minimi¬ 
zation  process  drives  the  curve  toward  the  desired  shape. 

As  an  alternative,  implicit  surface  evolution  models  have 
been  introduced  by  Malladi  et  al. 11,12  and  Caselles  et  al.13  In 
these  models,  the  curve  and  surface  models  evolve  under  an 
implicit  speed  law  containing  two  terms,  one  that  attracts  it  to 
the  object’s  boundary  and  another  that  is  closely  related  to  the 
regularity  of  the  shape.  Specifically,  the  proposal  is  to  use  the 
level-set  approach  of  Osher  and  Sethian.14  This  is  an  interface 
propagation  technique  used  for  a  variety  of  applications,  in¬ 
cluding  segmentation.  The  initial  curve  is  represented  here  as 
the  zero  level  set  of  a  higher  dimensional  function,  and  the 
motion  of  the  curve  is  embedded  within  the  motion  of  that 
higher  dimensional  function.  An  energy  formulation  similar  to 
the  active  contours  leads  to  a  minimization  process  with  sev¬ 
eral  advantages.  First,  the  zero  level  set  of  the  higher  dimen¬ 
sional  function  is  allowed  to  change  topology  and  form  sharp 
comers.  Second,  geometric  quantities  such  as  normal  and  cur¬ 
vature  are  easy  to  extract  from  the  hypersurface.  Finally,  the 
method  expands  straightforwardly  to  3-D,15  but  adding  a  di¬ 
mension  to  the  problem  increases  the  computational  cost  as¬ 
sociated  with  the  method.  The  narrow-band  approach  of  Ad- 
alsteinsson  and  Sethian16  accelerates  the  level-set  flow  by 
considering  for  computations  only  a  narrow  band  of  pixels 
around  the  zero  level  set.  However,  in  our  experience,  the 
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Fig.  2  Protocol  followed  on  tissue  blocks.  The  different  steps  (sectioning,  annotation,  reconstruction,  high-magnification  acquisition,  and  molecular 
analysis)  are  illustrated.  Samples  are  fully  sectioned  at  5  /xm  (step  1).  The  odd  sections  are  stained  with  H&E,  the  even  ones  with  some  kind  of 
fluorescence  technique  (application  dependent).  Images  are  acquired  of  all  the  sections  (step  2),  and  the  structures  of  interest  are  delineated  in  the 
H&E-stained  ones  (step  3).  A  3-D  reconstruction  of  the  specimen  is  created  from  these  markings  (step  4).  From  the  3-D  reconstruction  of  the  tissue, 
different  areas  can  be  selected  for  molecular  analysis  (step  5).  The  system  will  take  high-magnification  images  of  those  areas  on  the  corresponding 
fluorescent  sections  (step  6).  Image  analysis  tools  can  then  be  used  to  quantify  the  presence  and  distribution  of  molecular  markers  in  the 
high-magnification  images  (step  7). 


narrow-band  technique  does  not  reduce  the  computational 
cost  to  a  reasonable  limit,  owing  to  the  large  size  of  the  im¬ 
ages  of  the  sections. 

Thus  we  propose  to  use  the  fast-marching  method.17  This 
method  considers  monotonically  advancing  fronts  (speed  al¬ 
ways  positive  or  negative),  providing  a  result  very  quickly, 
albeit  one  that  is  not  as  accurate  as  the  one  obtained  by  using 
the  level-set  algorithm.  Malladi  and  Sethian18  showed  that  the 
fast-marching  method  can  be  used  as  the  initial  condition  for 
the  slower  but  more  accurate  level-set  segmentation,  obtain¬ 
ing  real-time  delineation  of  the  structures  of  interest.  A  com¬ 
bination  of  all  these  tools  is  the  framework  we  use  to  recon¬ 
struct  normal  and  cancerous  ducts  in  mammary  gland  tissue 
sections  of  DC1S  samples. 

This  paper  is  organized  as  follows.  Section  2  describes  the 
general  tissue  handling  and  image  acquisition  protocols  that 
we  use,  as  well  as  the  theoretical  basis  of  the  segmentation 
scheme;  Sec.  3  shows  the  results  of  applying  our  method  to 
histological  tissue  sections;  and  Sec.  4  discusses  the  results 
and  suggests  some  future  developments  and  improvements  to 
our  approach. 

2  Methodology 

2.1  Tissue  Processing  and  Imaging 

The  tissue  processing  and  staining  protocol  used  is  illustrated 
in  Fig.  2.  Tissue  blocks  of  4-  to  5-mm  thickness  were  sliced 
into  5  jx m  (thin)  sections  (step  1).  The  odd  sections  were 
stained  with  H&E  to  obtain  morphological  information  at 
both  the  cytological  (single  cell)  and  architectural  (organ)  lev¬ 
els.  The  even  sections  were  stained  using  some  fluorescence 
technique  (e.g.,  immunocytochemistry,  fluorescence  in  situ 


hybridization),  depending  on  the  molecular  phenomena  that 
we  wanted  to  study.  Describing  the  acquisition  and  analysis  of 
the  fluorescent  images  is  out  of  the  scope  of  this  paper,  which 
focuses  on  the  segmentation  of  epithelial  structures  in  the 
H&E-stained  sections.  Therefore  the  rest  of  this  section  de¬ 
scribes  only  the  protocol  used  for  the  H&E-stained  sections. 
Low  magnification  (2.5  X),  panoramic  images  of  all  the  sec¬ 
tions  were  automatically  acquired  using  a  motorized  Zeiss 
Axioplan  I  microscope  coupled  with  a  monochrome  XilliX 
Microimager  CCD  camera  (step  2).  To  create  these  large  pan¬ 
oramic  images,  the  system  scanned  the  entire  section,  taking 
single-field-of-view  snapshots  and  tiling  them  together  into 
single,  whole-view  images  of  the  sections.  The  required  se¬ 
quence  of  microscope  movements  and  camera  operations  is 
produced  by  an  application  running  on  the  Sun  Ultra  10  work¬ 
station  that  controls  the  camera  and  all  moving  parts  of  the 
microscope. 

Next  we  annotated  the  structures  of  interest  (ducts,  lymph 
nodes,  tumors)  in  the  images  of  the  H&E-stained  sections 
(step  3).  These  annotated  structures  were  used  to  produce  a 
three-dimensional  model  of  each  tissue  block  (step  4). 

These  four  initial  steps  are  the  focus  of  this  paper.  How¬ 
ever,  to  better  understand  the  rationale  for  the  process,  we 
briefly  describe  the  final  three  steps,  which  are  out  of  the 
intended  scope  for  this  paper.  The  user  can  choose  to  revisit 
areas  in  the  three-dimensional  rendition  of  the  organ,  based  on 
their  morphology.  This  can  be  done  on  the  H&E  sections  or 
on  the  intermediate  fluorescent  sections.  To  do  so,  the  system 
requires  the  user  to  acquire  high-magnification  (40  to  100X) 
images  of  the  corresponding  section(s)  (steps  5  and  6).  If  the 
high-magnification  images  are  taken  from  the  intermediate 
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(b) 


human  case.  The  background  pattern  created  by  the  acquisition  method  is 


§r>J 


Fig.  3  Background  correction,  (a)  Image  of  a  section  belonging  to  a 
readily  noticeable,  (b)  Same  image  after  background  correction. 

immunofluorescent  sections,  quantification  routines  can  be 
run  on  these  new  images  (step  7). 

Manually  annotating  all  the  relevant  morphological  struc¬ 
tures  in  all  the  sections  of  fully  sectioned  tissue  blocks  is 
feasible  but,  for  all  purposes,  impractical  because  of  the  tre¬ 
mendous  human  effort  needed.  Although  it  might  be  the  most 
accurate  and  reliable  approach,  manual  annotation  is  not  pos¬ 
sible  except  when  reconstructing  small,  very  simple  tissue 
volumes.  As  a  result,  we  have  developed  automatic  methods 
that  eliminate  or  greatly  reduce  human  interaction,  thus  mak¬ 
ing  the  reconstruction  of  complex  systems  possible.  The  fol¬ 
lowing  discussion  describes  our  approach. 

2.2  Segmentation 

2.2.1  Background  removal 

The  algorithm  that  we  use  to  acquire  an  image  of  an  entire 
section  creates  a  mosaic  from  a  set  of  snapshots  (one  per  field 
of  view).  This  approach  gives  rise  to  a  background  pattern 
across  the  image  [Fig.  3(a)]  involving  relatively  large  gradi¬ 
ents  in  between  elements  of  the  mosaic.  The  objects  of  inter¬ 
est  often  span  several  fields  of  view,  and  since  our  segmenta¬ 
tion  approach  depends  largely  on  the  gradients  of  the  image, 
we  need  to  eliminate  the  background  pattern  in  order  to  obtain 
good  segmentation  results.  This  can  be  done  by  performing  a 
set  of  arithmetic  operations  on  the  “mosaic"  image,  known  as 
background  compensation. 

First  we  need  a  phantom,  that  is,  an  image  of  an  empty 
field  of  view  taken  under  the  same  illumination  conditions 
and  microscope  configuration  that  we  used  to  acquire  the  ini¬ 
tial  image.  Since  most  of  the  images  that  we  acquire  have  an 
empty  frame  in  the  upper  left  comer,  it  is  simple  to  choose 
that  frame  as  our  phantom  for  the  corresponding  section.  After 
normalizing  the  pixel  values  in  the  phantom,  and  for  each 
frame  in  the  entire  image,  we  divide  the  value  at  each  pixel  by 
the  value  at  the  corresponding  pixel  in  the  phantom  frame. 
The  resulting  image  is  background-corrected  as  shown  in  Fig. 
3(b),  and  it  is  a  better  input  for  our  segmentation  algorithms. 

2.2.2  Preliminary  segmentation:  fast-marching 
method 

Consider  a  monotonically  advancing  2-D  front  C  with  a  speed 
F  that  is  always  positive  in  the  normal  direction,  starting  from 
an  initial  point  p0 , 


-  =  F„.  (1) 

This  equation  drives  the  evolution  of  a  front  starting  from  an 
infinitesimal  circular  shape  around  p0  until  each  point  p  inside 
the  image  domain  is  visited  and  assigned  a  crossing  time 
U{p ),  which  is  the  time  /  at  which  the  front  reaches  point  p. 

The  gradient  of  the  arrival  time  is  inversely  proportional  to 
the  speed  function,  and  thus  we  have  a  form  of  the  eikonal 
equation 

|V (J\F=  1  and  U(p0)  =  0.  (2) 

One  way  to  solve  Eq.  (2)  is  to  use  upwind  finite-difference 
schemes  and  iterate  the  solution  in  time.15  In  other  words,  the 
scheme  relies  on  one-sided  differences  that  look  in  the  upwind 
direction  of  the  moving  front,  thereby  choosing  the  correct 
viscosity  solution,  namely 

[max(w  —  Uj-ij  ,u—Uj+  ij.O)]2 

,  1 

+  [max(M-  Ujj-t  ,u  —  ULj+ 1,0)]2  =  — — .  (3) 

FiJ 

The  key  to  solving  this  equation  rapidly  is  to  observe  that 
the  information  propagates  from  smaller  values  of  U  to  larger 
values  in  the  upwind  difference  structure  in  Eq.  (3).  The  idea 
is  to  construct  the  time  surface,  one  piece  at  a  time,  by  only 
considering  the  “frontier"  points;  we  detail  the  fast-marching 
method  in  Table  1 . 

Note  that  in  solving  Eq.  (3),  only  alive  points  are  consid¬ 
ered.  This  means  that  for  each  point,  the  calculation  is  made 
using  the  current  values  of  U  at  the  neighbors,  and  not  esti¬ 
mates  at  other  trial  points.  Considering  the  neighbors  of  the 
grid  point  (ij)  in  4-connectedness,  we  designate  {A{,A2} 
and  {/?,  ,Z?2}  as  the  two  couples  of  opposite  neighbors  so  that 
we  get  the  ordering  U(A  i)^  U(A2),  U{B\ )^  U(B2)<  and 
U(A  ,)^t/(Z?i).  Since  we  have  u^U(B {)^U(A  ,),  we  can 
derive 

[«-t/(/j1)]2  +  [«-l/(B,)]2=-2-.  (4) 

FU 

Computing  the  discriminant  (A)  of  Eq.  (4),  we  complete  the 
steps  described  in  Table  2.  Thus  the  algorithm  needs  only  one 
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Table  1  Fast-marching  algorithm. 


Algorithm  for  2-D  Fast  Marching 

•  Definitions: 

Alive  set:  all  grid  points  where  the  action  value  U  has 
been  reached  and  will  not  be  changed; 

Trial  set:  next  grid  points  (4-connectedness  neighbors)  to 
be  examined.  An  estimate  U  of  U  has  been  computed  using 
Eq.  (3)  from  Alive  points  only  (i.e.,  from  U); 

Far  set:  all  other  grid  points,  where  there  is  no  estimate 
for  U  yet; 

•  Initialization: 

Alive  set:  reduced  to  the  starting  point  p0 ,  with  U[p0) 

=  U(po)  =  0; 

Trial  set:  reduced  to  the  four  neighbors  p  of  p0  with 
initial  value  U(p)  =  1/F(p)  ( D(p)  =  *0; 

Far  set:  all  other  grid  points,  with  U= <*; 

•  Loop: 

Let  p  =  ( /’mirv/mm)  be  the  trial  point  with  the  smallest  action 
U; 

Move  it  from  the  trial  to  the  alive  set  (i.e.,  l/(p) 

=  U,  j  is  frozen); 

For  each  neighbor  (/,/)  (4-connectedness  in  2-D)  of 

( fmin 

If  (/,/)  is  far,  add  it  to  the  trial  set  and  compute  U, ,  using 

Eq-  (3); 

If  (/,/)  is  trial,  update  the  action  U,  j  using  Eq.  (3). 


pass  over  the  image  to  find  a  solution.  To  execute  all  the 
operations  that  we  just  described  in  the  minimum  amount  of 
time,  the  trial  points  are  stored  in  a  min-heap  data  structure.1^ 
Since  the  complexity  of  changing  the  value  of  one  element  of 


Table  2  Solving  the  upwind  scheme  locally. 


1 .  •  If  A^O,  u  should  be  the  largest  solution  of  Eq.  (4); 

If  the  hypothesis  u>U[B])  is  wrong,  go  to  2; 

If  this  value  is  larger  than  U(B i),  this  is  the  solution; 

•  If  A<0,  B]  has  an  action  too  large  to  influence  the 
solution.  It  means  that  u>U[B d  is  false.  Go  to  2; 

Simple  calculus  can  replace  case  1  by  the  test: 

If  l/fj (>U|8,) - U(A|),  0=U|B,)  +  U(A,)  +  {21/F?(. 

—  [ L/( B i )  —  U( A])]2}]/'2/2  is  the  largest  solution  of  Eq.  (4) 
else  go  to  2; 

2.  Considering  that  we  have  u<U(B i)  and  u^U(A^),  we 
finally  have  u—  U(A\)  +  l/F/(/ . 


the  heap  is  bounded  by  a  worst-case  processing  time  of 
[0(log2  TV)],  the  total  algorithm  has  a  complexity  of 
0{N  log2  AO  on  a  grid  with  N  nodes. 

Finally,  we  define  the  speed  of  propagation  in  the  normal 
direction  as  a  decreasing  function  of  the  gradient  |V/(„t)|, 
that  is,  a  function  that  is  very  small  fnear  large  image  gradi¬ 
ents  (i.e.,  possible  edges)  and  large  when  the  brightness  level 
is  constant: 

F(.v)  =  exp- a\ V/(jc)|,  a>0,  (5) 

where  a  is  the  edge  strength,  or  the  weight  that  we  give  to  the 
presence  of  a  gradient  in  order  to  slow  down  the  front.  De¬ 
pending  on  this  value,  the  speed  function  falls  to  zero  more  or 
less  rapidly,  and  thus  it  could  stop  a  few  grid  points  away 
from  the  real  edge.  Also,  variations  in  the  gradient  along  the 
boundary  can  cause  inaccurate  results.  False  gradients  caused 
by  noise  can  be  avoided  using  an  edge-preserving  smoothing 
scheme  on  the  image  as  a  preprocessing  step;  see  Ref.  19. 

The  user  can  run  the  fast-marching  method  from  a  given 
set  of  initial  points  (mouse  clicks  on  the  background  of  the 
images).  Alternatively,  the  user  can  decide  to  segment  only 
the  structures  within  a  manually  defined  rectangular  region  of 
interest.  If  the  region  is  too  big,  subsampling  can  be  used  so 
that  the  segmentation  process  is  not  too  slow.  However,  this 
option  must  be  used  carefully,  since  subsampling  smooths  the 
boundaries  of  the  objects  present  in  the  image  and  can  com¬ 
pletely  obliterate  smaller  structures.  The  resulting  contour  (or 
contours  if  we  are  segmenting  several  objects  at  the  same 
time)  will  provide  an  excellent  initial  condition  for  the  level- 
set  method. 

Putting  all  of  these  elements  together,  we  are  able  to  obtain 
a  good  approximation  of  the  shape  of  the  object  that  we  are 
trying  to  segment  (Fig.  5).  In  order  to  improve  the  final  result, 
we  propose  to  run  a  few  iterations  of  the  level-set  method 
using  the  result  of  the  fast-marching  method  as  the  initial 
condition. 

2.2.3  Final  segmentation:  level-set  method 

Once  we  have  obtained  a  good  approximation  of  the  shape  of 
the  object  using  the  fast-marching  algorithm,  we  can  afford  to 
use  the  more  computationally  expensive  level-set  method  to 
improve  the  result  of  the  segmentation.  The  essential  idea 
here  is  to  embed  our  marching  front  as  the  zero  level  set  of  a 
higher  dimensional  function.  In  our  case,  we  take  that  func¬ 
tion  to  be  <f>{. x)  =  ±d ,  where  d  is  the  signed  distance  from  x 
to  the  front  (see  Fig.  4),  assigning  negative  distances  for  pix¬ 
els  inside  the  evolving  curve,  and  positive  distances  for  pixels 
outside  it. 

Following  the  arguments  in  Ref.  14,  we  obtain  the  follow¬ 
ing  evolution  equation: 

^  +  FU^)|V0|  =  O,  (6) 

where  F  is  again  the  speed  in  the  normal  direction.  This  is  an 
initial-value  partial  differential  equation,  since  it  describes  the 
evolution  of  the  solution  on  the  basis  of  an  initial  condition 
defined  as  <£(*,/  =  0)  =  </>() .  As  pointed  out  earlier,  the  level- 
set  approach  offers  several  advantages: 

•  The  zero  level  set  of  the  function  can  change  topology 
and  form  sharp  comers. 
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Fig.  4  Basic  concept  behind  the  level-set  method.  The  marching  front 
is  embedded  as  the  zero  level  set  of  a  higher  dimensional  function. 

•  A  discrete  grid  can  be  used  together  with  finite  differ¬ 
ences  to  approximate  the  solution. 

•  Intrinsic  geometric  quantities  such  as  normal  and  curva¬ 
ture  can  be  easily  extracted  from  the  higher  dimensional 
function. 

•  Everything  extends  directly  to  3-D. 

To  mold  the  initial  condition  (in  our  case  the  result  of  the 
fast-marching  method)  into  the  desired  shape,  we  use  two 
force  terms.  By  substituting  these  two  terms  in  the  motion 
equation18,20  we  get: 

*-g(l-«)|V*|-0VgXV*=O,  €>0,  /3>0, 

(7) 

where  g  is  an  edge  indicator  function  defined  by  the  speed  of 
the  front  [Eq.  (5)]  in  the  eikonal  Eq.  (2);  k  is  the  curvature  of 
the  expanding  front,  and  <f>,  is  the  unknown  that  we  are  trying 
to  compute.  As  before,  the  image  I(x)  can  be  preprocessed 
using  an  edge-preserving  smoothing  scheme. 

The  second  term  of  the  equation  has  two  components.  The 
first  one  slows  the  surface  in  the  neighborhood  of  high  gradi¬ 
ents  (edges),  while  the  second  one  (motion  by  curvature)  pro¬ 
vides  regularity  to  the  curve.  The  parameter  e  is  the  weight  of 
the  motion  by  curvature  term,  and  determines  the  strength  of 
its  regulatory  effect:  the  bigger  we  make  e,  the  more  we  limit 


(a) 


the  possibility  of  obtaining  sharp  comers  and  irregular  con¬ 
tours.  In  practice,  an  intermediate  value  of  e  provides  a  good 
trade  off  between  contour  smoothing  and  accuracy. 

Finally,  the  last  term  of  the  equation  attracts  the  front  to 
the  object's  boundaries.  It  aligns  all  the  level  sets  with  the 
ideal  gradient,  which  would  be  a  step  function  centered  at  the 
point  of  maximum  gradient  in  the  original  image,  is  the 
weight  of  the  advection  of  the  front  by  the  edge  vector  field 
Vg.  It  determines  the  strength  of  the  attraction  of  the  front  to 
the  edges. 

At  times,  for  very  small  objects,  it  is  possible  to  use  the 
level-set  method  from  the  initial  point.  However,  for  any  type 
of  morphological  structure,  we  can  use  the  result  of  the  eiko¬ 
nal  Eq.  (2),  T/(jt,v),  as  the  initial  condition  for  the  level-set 
algorithm:  <f>(x,y;t  =  0)  =  U(x,y).  Then  by  solving  Eq.  (7) 
for  a  few  time  steps  using  the  narrow-band  approach,  we  ob¬ 
tain  an  accurate,  real-time  segmentation  of  the  desired  object 
(see  Fig.  5). 

3  Results 

In  this  section  we  consider  the  problem  of  reconstructing 
DCIS  areas  in  a  tissue  biopsy  of  a  cancerous  human  breast,  as 
well  as  a  group  of  normal  ducts  through  an  entire  mouse 
mammary  gland.  The  tissue  samples  were  sliced  for  a  total  of 
55  sections  in  the  human  tissue  and  206  sections  in  the  mouse 
one.  We  used  all  of  the  sections  for  the  reconstruction  of  the 
human  case  and  40  sections  for  reconstruction  of  the  mouse 
case.  Manually  delineating  all  the  structures  in  every  section 
is  an  extremely  time-consuming  process.  To  automatically 
segment  those  structures  using  the  framework  described  in 
Sec.  2,  we  begin  by  defining  a  region-of-interest  (ROI)  where 
we  will  run  the  segmentation  algorithm.  This  ROI  can  be 
extended  to  cover  the  entire  section,  but  considering  the  size 
of  the  images,  it  is  wise  to  use  subsampling  in  order  to  run  the 
algorithm  in  real  time.  The  level  of  subsampling  can  be  de¬ 
termined  by  the  user;  greater  subsampling  can  be  used  on 
large  ROIs  without  compromising  the  resolution  and  accuracy 
of  the  final  segmentation. 

After  defining  the  ROI,  we  have  to  tune  the  different  pa¬ 
rameters  of  the  segmentation  process,  particularly  a  [Eq.  (5)] 


(b) 


Fig.  5  The  segmentation  of  a  lymph  node  in  a  mouse  mammary  gland  section  is  shown  in  black,  (a)  The  result  of  the  fast-marching  method;  it 
provides  a  good  approximation  to  the  boundaries  of  the  lymph  node;  the  blue  point  in  the  middle  of  the  lymph  node  is  the  initial  contour  from 
which  the  algorithm  was  run.  (b)  The  result  of  using  the  level-set  method  after  the  fast-marching  method;  the  final  contour  is  more  accurate  and 
smoother.  In  both  cases  the  images  were  subsampled  in  the  x  and  y  directions  to  be  able  to  run  the  segmentation  in  real  time. 
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(a)  (b)  (C) 


Fig.  6  Segmentation  of  a  DCIS  tumor  in  human  mammary  gland  tissue,  (a)  Segmentation  using  just  the  level  set  method.  The  blue  dot  represents 
the  initial  contour,  (b)  Results  of  the  full  segmentation  with  blue  lines  delineating  tumor  masses,  (c)  Results  after  editing  using  the  interactive  tools 
provided  by  the  system. 


and  e  [Eq.  (7)].  This  is  done  by  the  user  based  on  the  default 
values  provided  by  the  algorithm  and  the  type  of  object  that 
he  or  she  is  trying  to  segment.  However,  we  have  observed 
that  a  particular  set  of  parameters  is  frequently  good  enough 
to  segment  similar  structures  (i.e.,  all  the  tumors,  all  the  ducts, 
all  the  lymph  nodes)  throughout  all  the  sections  of  a  particular 
tissue  block.  Thus  the  user  only  needs  to  modify  the  param¬ 
eters  the  first  time  that  he  or  she  tries  to  segment  a  new  type 
of  morphological  element  in  the  tissue. 

After  selecting  the  parameters  of  the  flow,  initial  points  are 
defined  inside  (to  find  the  internal  contour)  or  outside  (to  find 
the  external  contour)  the  structures  of  interest.  In  most  cases 
one  computer  mouse  click  is  enough,  although  large  images 
may  require  several  evenly  distributed  clicks.  The  value  of 
U( x)  at  these  points  is  set  to  zero  as  in  Eq.  (2),  and  the 
fast-marching  algorithm  of  Table  1  is  executed.  When  this 
method  finishes,  the  final  U(x)  function  is  passed  as  the  ini¬ 
tial  condition  to  the  level-set  motion  Eq.  (7).  We  then  iterate 
this  equation  for  a  few  steps.  This  segmentation  scheme  pro¬ 
vides  a  result  in  less  than  1  s  for  images  whose  size  (after 
subsampling,  if  any)  is  around  2  kbytes  (e.g.,  512X512  pix¬ 
els)  running  on  a  Sun  Ultra  10  workstation  with  1  Gbytes  of 
RAM. 

3.1  Human  Case  Segmentation 

Figure  6  displays  the  result  obtained  with  the  combination  of 
the  fast-marching  and  the  level-set  methods  in  a  tissue  biopsy 
from  a  patient  with  an  intraductal  carcinoma.  Figure  6(b) 


shows  the  segmentation  of  a  tumor  mass  in  a  human  tissue 
block.  The  results  of  the  segmentation  can  be  edited  and  re¬ 
moved  with  the  interactive  tools  provided  by  our  system  [see 
Fig.  6(c)].  For  this  segmentation,  an  area  was  selected  around 
the  structures  of  interest  and  no  subsampling  was  used.  The 
initial  contours  are  represented  by  blue  points  in  Fig.  6(a). 

3.2  Mouse  Case  Segmentation 

In  Fig.  7  we  can  see  the  segmentation  of  the  external  contours 
of  several  ducts  in  a  particular  area  of  a  mouse  mammary 
gland.  In  this  case  we  also  run  the  algorithm  on  an  ROI  on 
one  of  the  sections  with  no  subsampling  factor.  Figure  7(a) 
displays  the  points  where  we  initialized  the  contours.  Figures 
7(b)  and  7(c)  show  the  results  of  the  segmentation  before  and 
after  interactive  correction  of  the  results,  respectively. 

3.3  3-D  Reconstruction 

Finally  the  segmented  shapes  are  connected  (manually)  be¬ 
tween  sections,  and  3-D  reconstructions  of  the  samples  are 
built.  Figure  8  shows  a  reconstruction  of  the  tumors  contained 
in  the  human  tissue  block,  including  the  tumor  shown  in  Fig. 
6.  Increasing  the  “motion  by  curvature”  term,  as  described  in 
Sec.  2,  can  reduce  surface  noise.  In  Fig.  9  we  can  see  the 
reconstruction  of  the  normal  ducts  segmented  in  the  images  of 
the  mouse  mammary  gland  (see  one  of  the  corresponding  2-D 
segmentations  in  Fig.  7). 


Fig.  7  Segmentation  of  normal  ducts  in  a  mouse  mammary  gland  tissue  sample,  (a)  Initial  data,  (b)  Results  of  the  segmentation  (red  lines  delineate 
normal  ducts),  (c)  Results  after  editing  using  the  interactive  tools  provided  by  the  system. 
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Fig.  8  3-D  reconstruction  of  tumors  in  a  human  mammary  gland  tis¬ 
sue  block.  Tumor  masses  are  rendered  as  gray  volumes.  The  scene 
was  stretched  ten  times  in  the  z  direction  to  obtain  a  better  view. 


3.4  Further  Examples 

Figures  10  and  1 1  show  two  more  examples  of  the  results  that 
can  be  obtained  with  our  segmentation  approach.  In  both 
cases  the  initial  ROI  was  subsampled  by  a  factor  of  2  in  both 
the  x  and  y  directions.  The  same  segmentation  parameters  (a 
and  6)  were  used  for  both  examples. 

Figure  10(a)  shows  a  DOS  lesion  together  with  a  normal 
duct  in  a  human  tissue  section.  Both  structures  were  seg¬ 
mented  at  the  same  time  using  a  single  initial  point,  as  can  be 
seen  in  Fig.  10(b).  Figure  10(c)  shows  the  results  incorporated 
to  the  full  resolution  image. 

In  Fig.  1 1(a),  a  terminal  ductal  lobular  unit  (TDLU)  can  be 
observed.  These  are  lobuloalveolar  structures  where  milk  is 
produced  during  lactation  in  the  human  breast.  The  multiple 
alveoli  that  form  the  TDLU,  together  with  the  presence  of  a 
ductal  part,  make  automatic  segmentation  of  this  type  of 
structure  a  difficult  task.  However,  after  subsampling  the  im¬ 
age,  our  algorithm  is  able  to  find  a  contour  that  surrounds  the 
entire  structure  [Figs.  11(b)  and  11(c)],  thus  allowing  its  3-D 
reconstruction. 


Fig.  9  3-D  reconstruction  of  normal  ducts  in  a  mouse  mammary 
gland.  The  ducts  are  rendered  as  gray  volumes.  A  single  duct  and  its 
branches  can  be  traced  throughout  the  gland.  The  z  direction  was  not 
stretched  in  this  case. 


4  Discussion 

We  have  developed  a  microscopy  system  that  semiautomati- 
cally  reconstructs  histological  structures  from  fully  sectioned 
tissue  samples.  However,  the  interaction  required  to  operate 
the  system  is  quite  intensive,  limiting  the  scope  of  its  appli¬ 
cation  to  small  tissue  volumes  or  to  studies  not  requiring  high 
throughput  analysis  of  the  samples.  In  this  paper  we  have 
presented  a  method  that  reduces  the  time  and  interaction 
needed  to  build  the  3-D  reconstruction  of  a  tissue  block,  thus 
increasing  the  potential  throughput  of  the  system  and  there¬ 
fore  allowing  us  to  use  this  approach  for  the  reconstruction  of 
large,  complex  specimens.  To  achieve  this  goal  we  have  com¬ 
bined  image  processing  techniques  and  two  well-established 
schemes  for  interface  propagation:  the  fast-marching  method 
and  the  level-set  method. 

Our  approach  starts  by  correcting  the  background  of  the 
images.  This  is  an  important  step,  since  the  background  pat¬ 
tern  generated  during  image  acquisition  modifies  the  gradient 


(a)  (b)  (c) 

Fig.  10  Segmentation  of  a  different  DCIS  tumor  in  human  mammary  gland  tissue,  (a)  The  duct  on  the-left  contains  a  DCIS  lesion  with  a  necrotic 
center;  the  one  on  the  right  is  normal,  (b)  The  ROI  was  subsampled  by  a  factor  of  2  in  both  the  x  and  y  directions;  the  blue  dot  represents  the  initial 
seed,  (c)  Results  of  the  segmentation  on  the  full-resolution  image. 
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(a)  (b)  (c) 

Fig.  11  Segmentation  of  a  terminal  ductal  lobular  unit  in  a  human  mammary  gland,  (a)  The  duct  on  the  left  contains  a  section  of  a  TDLU,  one  of 
the  sites  of  milk  production  in  the  human  breast,  (b)  The  ROI  was  subsampled  by  a  factor  of  2  in  both  the  x  and  y  directions;  the  blue  dot  represents 
the  initial  seed,  (c)  Results  of  the  segmentation  on  the  full-resolution  image. 


of  the  image,  and  the  speed  function  that  we  use  for  interface 
propagation  depends  on  that  gradient.  Once  the  background 
has  been  corrected,  we  run  the  fast-marching  method.  This 
technique  provides  a  good  approximation  of  the  boundaries  of 
the  objects  that  we  are  trying  to  segment  in  a  very  short  time, 
since  it  assumes  monotonic  speed  functions  (always  positive 
or  always  negative).  We  then  use  the  approximation  provided 
by  the  fast-marching  method  as  the  initial  condition  for  the 
level-set  method.  This  more  computationally  expensive  algo¬ 
rithm  is  run  for  just  a  few  steps,  enough  to  fit  the  front  to  the 
contours  of  the  structures  of  interest,  but  not  as  many  as  to 
make  the  segmentation  too  time-consuming. 

Though  this  approach  is  very  useful,  it  can  still  be  im¬ 
proved.  The  most  accurate  segmentations  (and  reconstruc¬ 
tions)  are  obtained  on  full-resolution  images.  However,  for 
some  large  structures,  such  as  lymph  nodes,  or  when  trying  to 
delineate  multiple  elements  at  the  same  time  (for  example,  a 
group  of  ducts),  segmentation  on  a  full-resolution  image  is  not 
real  time,  and  can  take  up  to  1  min.  Using  subsampling  takes 
the  segmentation  execution  time  back  to  real  time  at  the  ex¬ 
pense  of  some  accuracy  loss.  Also,  in  areas  with  a  lot  of 
texture  in  the  tissue  around  the  ducts  (stroma),  tuning  the 
parameters  of  the  algorithm  (a  and  6)  becomes  more  difficult. 
At  times  this  process  can  take  a  few  trials,  since  the  expand¬ 
ing  front  tends  to  get  trapped  in  high  gradients  that  do  not 
correspond  to  the  boundaries  of  the  feature  that  we  are  trying 
to  segment,  but  to  the  texture  of  the  stroma.  Finally,  once  all 
the  structures  of  interest  have  been  segmented,  the  user  still 
needs  to  manually  connect  them  between  sections.  This  con¬ 
stitutes  a  new  bottleneck  in  the  tissue  analysis  process. 

For  these  reasons,  we  are  currently  working  on  a  time  step- 
independent  scheme  that  is  expected  to  be  faster  than  the  cur¬ 
rent  one.  To  improve  the  accuracy  of  the  results,  we  have 
developed  an  edge-preserving  smoothing  algorithm  based  on 
the  Beltrami  flow,lg  which  can  replace  the  Gaussian  smooth¬ 
ing  currently  used  before  executing  the  segmentation  meth¬ 
ods.  This  algorithm  eliminates  false  gradients  that  are  due  to 
noise,  while  enhancing  gradients  that  are  due  to  the  object’s 
boundaries,  thus  allowing  the  front  to  fit  the  boundaries  of  the 


object  more  accurately.  Finally,  the  level-set  approach  is 
readily  extensible  to  3-D.  The  ability  to  segment  3-D  struc¬ 
tures  of  interest  versus  2-D  ones  would  save  the  process  of 
connecting  the  segmented  2-D  contours  from  section  to  sec¬ 
tion,  thus  improving  the  analysis  time.  Also,  since  geometric 
properties  can  be  easily  extracted  from  the  higher  dimensional 
function  used  in  the  level-set  algorithm,  we  could  readily  ob¬ 
tain  some  information  about  the  extracted  volume  from  the 
segmentation  algorithm  itself.  We  are  also  working  on  a  fully 
interactive  segmentation  method  based  on  the  model  of  the 
intelligent  scissors,21  because  the  selection  of  the  seed  points 
for  segmentation  is  something  that  cannot  be  automated,  ow¬ 
ing  to  the  high  variability  in  the  shapes  of  object’s  to  be  seg¬ 
mented. 

In  conclusion,  we  have  presented  a  real-time  method  for 
automatic  segmentation  of  morphological  structures  in  mam¬ 
mary  gland  tissue  sections.  It  is  precisely  the  delineation  of 
those  structures  that  required  the  heaviest  user  interaction  in 
our  sample  analysis  protocol.  Therefore,  the  automatic  ap¬ 
proach  to  segmentation  that  we  describe  here  represents  a  first 
step  toward  real-time  reconstruction  and  analysis  of  mam¬ 
mary  gland  samples.  Achieving  that  goal  would  allow  us  to 
accelerate  our  studies  on  the  biological  basis  of  human  breast 
cancer.  Moreover,  obtaining  real-time  reconstruction  and 
analysis  of  samples  from  our  system  would  be  useful  for 
pathological  diagnosis  in  a  clinical  environment;  3-D  render¬ 
ings  of  all  the  morphological  structures  in  a  mammary  gland 
biopsy  could  be  mapped  with  the  distribution  of  particular 
markers  of  breast  cancer  within  a  few  hours  of  extracting  the 
tissue  from  the  patient.  From  an  intrasurgical  point  of  view, 
the  renderings  would  prove — tissue  processing  and  stain 
permitting — an  important  tool  in  the  evaluation  of  breast  tu¬ 
mors  and  their  margins. 
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Abstract — We  present  two  new  methods  for  automatic 
registration  of  microscope  images  of  consecutive  tissue  sections. 
Both  methods  are  based  on  the  images  gradient  correlation, 
but  the  first  one  makes  use  of  the  entire  image  information 
whereas  the  second  one  starts  from  the  segmentation  result. 
They  represent  two  possibilities  for  the  first  step  in  the  3-D 
reconstruction  of  histological  structures  from  serially  sectioned 
tissue  blocks.  The  aim  lies  in  aligning  the  sections  in  order  to 
place  every  relevant  shape  contained  in  each  image  in  front  of 
its  corresponding  shape  in  the  following  section.  This  is 
accomplished  by  finding  the  best  rigid  body  transformation 
(translation  and  rotation)  of  the  image  being  registered,  by 
maximizing  a  matching  function.  To  reduce  computing  time, 
we  use  a  multiresolution  pyramidal  approach  that  seeks  the 
best  registration  transformation  in  increasing  resolution  steps. 
In  each  step,  a  subsampled  version  of  the  images  is  used.  The 
gradient  of  the  images  is  computed  using  a  Sobel  operator. 
Then,  the  gradient  image  is  binarized  using  an  automatic 
threshold  and  the  distance-transform  of  the  binary  image  is 
computed.  A  proximity  function  is  then  calculated  between  the 
distance  image  of  the  image  being  registered  and  that  of  the 
reference  image.  The  transformation  providing  a  maximum  of 
the  proximity  function  is  then  used  as  the  starting  point  of  the 
following  step.  This  is  iterated  until  the  error  is  below  a 
minimum  value. 

Keywords — Automatic  registration.  Image  processing, 
Biotechnology 

I.  Introduction 

A  correct  visualization  of  the  morphology  and 
functionality  of  the  mammary  gland,  as  similar  as  possible 
to  the  specimen  living  conditions,  is  basic  in  the  study  of 
normal  mammary  gland  biology  and  its  neoplastic  variants 
(i.e.  breast  cancer).  With  this  aim,  we  reconstruct  mammary 
gland  epithelial  structures  from  fully  sectioned  paraffin 
tissue  blocks,  after  it  is  stained  with  histological  and  protein 
or  genetic  markers.  Our  lab  has  developed  a  software 
microscopy  system  [1]  that  handles  a  microscope,  scans 
sections  of  mammary  gland  tissues,  and  processes  these 
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images  and  reconstructs  the  morphology  of  the  gland  using 
the  section  information.  As  a  starting  point  for  the  3-D 
reconstruction  of  the  histological  structures,  the  system 
employs  the  section  contours  found  by  using  our  2-D 
segmentation  tool.  However,  as  a  first  step  towards  an 
accurate  3-D  reconstruction,  a  proper  alignment  of  the 
images  of  sections  is  needed,  in  order  to  place  every  relevant 
2-D  structure  in  one  section  in  front  of  its  correspondent  2-D 
structure  in  the  following  section. 

A  perfect  image  alignment,  also  called  registration, 
appears  impossible  due  to  human  interaction  in  the  section 
creation  process  (manual  sectioning  can  cause  non-linear 
distorting  effects,  such  as  deformations,  folds,  tears  or  cuts 
in  the  tissue),  as  well  as  to  the  natural  differences  between 
sections  caused  by  the  cut  distance. 

This  first  step  of  the  registration  process  aims  at 
achieving  the  best  rigid  body-transform  for  each  pair  of 
images  for  consecutive  sections.  This  can  be  easily  reached 
with  a  manual  process,  marking  alternatively  three  points  in 
every  image  and  calculating  the  rigid  body-transform  (i.e. 
rotation  plus  translation)  that  minimizes  the  lineal  quadratic 
error  between  every  pair  of  points.  However,  manual 
registration  of  hundred  of  sections,  as  required  when 
working  with  complete  cases,  could  become  very  slow  and 
tedious. 

Although  a  perfect  registration  is  not  reachable,  an 
accurate  approach  is  usually  enough,  either  for  consolidating 
the  tri-dimensional  reconstruction,  or  for  obtaining  a  base 
for  following  non-rigid  registrations.  Therefore,  a  rigid 
affine  transformation  will  allow  us  to  solve  the  problem  of 
the  registration.  While  this  paper  explains  the  solution  for 
our  mammary  gland  tissue  sections  alignment,  it  can  be 
applied  to  many  other  different  kind  of  images. 


II.  Methodology 

Hematoxylin  and  Eosin  were  used  to  stain  5  micron 
serial  sections  from  paraffin  tissue  blocks.  The  tissue  was 
either  normal  mouse  mammary  gland  or  human  tissue  from 
biopsies  diagnosed  with  ductal  carcinoma  in  situ  in  the 
breast  (DCIS).  Every  section  image  was  taken  in  grayscale 
at  low  resolution  (2.5X  magnifications),  and  registered  with 
the  previous  one  in  the  case  following  the  algorithm  exposed 
in  the  next  paragraphs. 

As  in  all  automatic  registration  methods,  this  system 
requires  that  the  matching  function  addressing  the  algorithm 
measures  the  degree  of  similarity  between  the  image  to  be 
aligned  (or  target  image)  with  respect  to  the  reference 


1 


image,  in  order  to  achieve  the  best  transformation.  The 
criterion  selected  here  is  the  Hierarchical  Chamfer 
Matching  Algorithm  for  registering  [2]. 

In  this  method,  a  contour  is  projected  in  different 
positions  on  a  distance  image.  The  reference  image  is 
transformed  in  a  distance  image,  meaning  that  the  edges  of 
the  structures  appearing  in  the  image  receive  a  value  of 
white  (the  highest  one)  and  the  rest  of  the  pixels  present 
gradually  smaller  values  depending  on  the  distance  to  the 
edges  (whiter  as  closer  to  the  edges).  Once  the  distance 
image  is  calculated,  the  system  attempts  to  match  it  with  the 
contour  image,  a  binary  image  obtained  by  applying  an 
automatic  threshold  function  over  the  gradient  of  the  image 
to  be  aligned.  Before  going  on  with  the  matching  function, 
the  contour  image  is  transformed  with  three  degrees  of 
freedom:  translations  in  the  in  x-  and  y-  directions  and 
rotation  around  the  x-axis.  Since  the  proximity  function  is 
calculated  as  the  sum  of  the  scalar  product  (pixel  by  pixel) 
of  the  contour,  the  distance  images  and  the  edges  presenting 
white  values,  a  perfect  matching  between  images  should 
then  provide  the  biggest  value  as  function  result.  Therefore, 
when  this  sum  is  maximized,  the  best  transformation  -  and 
consequently  alignment  -  is  achieved. 

Considering  the  huge  size  of  the  images  being 
processed  -  usually  around  6000x6000  pixels  and  35 
Megabytes  each  -  the  system  will  proceed  over  the  highest 
resolution  images.  Therefore,  the  first  step  in  the  algorithm 
must  be  a  subsampling  of  both  target  and  reference  images, 
which  will  produce  an  important  reduction  in  the  dimension 
of  the  images,  and  consequently  in  the  computational  time. 
The  search  for  the  local  maxima  starts  in  two  reduced 
images  generated  from  the  original  ones  with  as  low  a 
resolution  as  possible.  The  best  match(ing)  between  the 
images  applying  all  possible  rotations  and  translations  to  the 
target  will  be  calculated.  Namely,  the  system  uses  all 
possible  rotations  from  0°  to  350°  in  steps  of  10°  and  all 
possible  translations  in  different  steps  depending  on  the 
image  size  and  guaranteeing  at  least  50%  of  image 
overlapping.  This  way  this  brute-force  first  level  allows  us 
to  find  an  initial  registration  that  can  be  used  as  a  starting 
point  for  the  following  step,  where  the  rank  of  translations 
and  rotations  will  be  reduced  to  the  neighborhood  of  the 
previous  result,  and  rotation  and  translation  steps  decreased 
in  order  to  better  refine  the  registration.  So  the  problem  is 
approximated  step  by  step,  and  the  result  of  every  step  is  the 
starting  point  for  the  following  one.  That  means  that  the 
algorithm  presents  a  pyramidal  organization,  and  uses 
different  degrees  of  resolution  and  variation  ranks  in  the 
rigid  transformations  at  every  pyramid  level.  Usually,  the 
bottom  of  the  pyramid  is  the  less  subsampled  image  (if  the 
scale  factor  is  2,  one  pixel  in  level  n  corresponds  to  four 
pixels  in  level  n  -  I). 

Since  the  system  decreases  gradually  the  subsampling 
factor  in  every  level,  lower  levels  involve  more  image 
information,  providing  more  accurate  results.  A  complete 
description  of  one  level  process  is  illustrated  in  Fig.  1. 


Fig.  I:  Flow  chart  describing  the  level  process.  The  three  first  steps 
(subsampling  with  blur,  gradient  and  threshold)  are  equal  for  both  images. 
At  the  fourth  step  the  target  image  suffers  the  rigid  body  transformation  and 
the  reference  is  converted  in  a  distance  image.  Same  steps  are  applied  in 
every  level  modifying  only  the  subsampling  parameters  and  the  ranks  in  the 
affine  transformation. 

The  second  registration  method  proposed  in  this  paper  is 
a  natural  extension  of  the  method  already  exposed.  Taking 
advantage  of  the  automatic  segmentation  tool  developed  in 
our  lab  [3],  a  new  algorithm  can  be  built  starting  from  the 
segmentation  results  (Fig.  2).  The  segmentation  produces  in 
every  image  a  list  of  contours,  which  can  be  reused  in  order 
to  create  a  new  binary  image  (black  background  and  white 
contours).  It  seems  obvious  that  this  binary  image  could 
substitute  the  thresholded  images  appearing  in  the  previous 
method,  improving  the  final  result,  given  the  fact  that  the 
contours  obtained  in  this  segmentation  process  resemble 
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Fig.  2:  Example  of  mice  breast  mammary  gland  section  segmented 
automatically  by  our  lab  tool  [3].  The  contours  in  red  represent  the  limits  of 
relevant  shapes,  such  as  ducts,  tumors  or  lymph  nodes. 


more  accurately  the  section  information  needed  by  the 
expert. 

Notice  here  that  the  frequent  user  of  the  segmentation 
tool  is  a  mammary  gland  biologist,  whose  experience  allows 
setting  up  the  tool  parameters  in  order  to  distinguish  the 
noise  from  the  relevant  information,  an  experience  that  was 
not  supplied  to  the  previous  automatic  registration  method 
(let  us  call  it  standard  method).  Another  advantage  in  this 
second  method  (let  us  call  it  shape  registration)  is  the 
reduction  in  the  complexity  of  the  image  preprocess,  i.e. 
only  the  two  contour  images  need  to  be  created  (considering 
the  subsampling  factor  of  the  correspondent  pyramid  level) 
and  then  the  distance  transform  of  the  contour  image 
corresponding  to  the  reference  image  is  calculated.  Next,  the 
normal  rigid  transformation  and  matching  function  will  be 
applied  through  the  different  levels  until  getting  to  the 
optimum  registration  values  (Fig.  3). 


III.  Results 

Once  the  correct  operation  of  the  system  was 
demonstrated  experimentally  for  artificial  cases,  that  is, 
cases  with  the  same  section  rotated  and  translated,  the 
method  was  applied  to  real  mammary  gland  sections  with 
fair  results.  One  way  of  displaying  the  performance  of  the 
registration  algorithm  consists  of  creating  a  color  image 
combining  both  target  and  reference  images,  representing 
the  original  reference  image  in  red,  and  the  target  image, 
after  being  transformed  according  to  the  parameters 
obtained,  in  green.  Overlapped  structures  should  then  appear 
in  yellow,  as  it  corresponds  to  the  sum  of  green  and  red. 
Therefore,  a  perfectly  aligned  pair  of  images  will  present 
most  of  the  areas  in  yellow  and  an  incorrectly  aligned  pairs 
will  show  areas  with  intermediate  colors  (green  towards 
red).  Fig.  4  shows  in  this  way  the  difference  in  accuracy 
between  two  consecutive  system  levels  and  its  implicit  error 
minimization. 

Running  the  application  on  a  PC  Pentium  IV  (2.66  GHz, 
with  512  Megabytes  of  RAM  memory)  under  Linux,  the 
standard  method  of  automatic  registration  on  two  typical 
sections  (around  35  Megabytes  each)  is  accomplished  in  4 
or  5  minutes  using  3  different  levels  of  resolution.  Due  to 
less  process  charge,  the  second  method  exposed  or  shape 
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Fig.  3:  flow  chart  describing  one  level  of  the  registration  process  for  the 
“shape  registration”  method.  Extracting  the  segmentation  information  from 
the  original  images  and  rescaling  the  contours  depending  on  the 
subsampling  factor,  the  binary'  images  are  built.  As  before,  the  reference  is 
transformed  in  a  distance  image  and  the  target  is  rotated  and  translated 
before  applying  the  matching  function.  For  this  example,  same  image  as 
used  in  Fig.  2  was  processed  as  reference  image. 

registration,  is  even  faster  than  the  previous  one  and  only 
takes  around  2  minutes  for  the  same  pair  of  sections. 

Usually  two  pyramid  levels  achieve  the  optimum 
registration  parameters.  For  more  complex  images,  three 
levels  are  strongly  recommended. 


Fig.  4:  Complete  result  and  detailed  zoom  of  the  first  and  second  level 
(respectively)  in  the  standard  automatic  registration  algorithm.  Yellow 
areas  mean  structure  overlapping  and  red  and  green  areas  represent 
incorrect  alignment.  These  color  pictures  allow  us  to  visualize  the  error 
reduction  evolution. 
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IV.  Discussion 


As  stated  at  the  beginning  of  the  paper,  consecutive 
sections  could  present  non-rigid  deformations  due  to  human 
interaction  in  the  section  elaboration  process.  Therefore,  in 
these  specific  cases,  an  affine  transformation  could  be 
insufficient  for  a  perfect  alignment  and  either  local 
corrections  in  the  rigid  registration  result  or  smooth  filtering 
over  the  volume  after  the  tri-dimensional  reconstruction 
would  then  provide  a  more  accurate  approximation  to  the 
problem.  Regardless,  the  rigid  registration  supplies  a 
suitable  result  in  most  of  our  cases,  and  can  be  the  first  step 
towards  a  fully  non-linear  elastic  registration. 

The  systematic  search  followed  by  our  method,  even  if 
it  is  improved  with  a  50%  of  forced  image  overlapping  and 
different  rotations  and  translations  steps  depending  on  the 
image  size  and  pyramid  level,  could  be  reviewed  in  order  to 
find  an  optimization  method  that  allows  us  to  leave  our 
current  brute  force  method  and  decreases  the  computational 
time  taken  by  the  system. 


V.  Conclusion 

We  described  a  fully  automatic  algorithm  for  the 
registration  of  microscopy  images  of  consecutive  tissue 
sections,  and  a  variant  of  the  same  method  based  on  the 
previous  segmentation  of  the  images.  The  system  is  based 
on  a  multiresolution  and  pyramidal  approach  in  order  to 
calculate  the  optimum  rigid  transformation  between 
Hematoxylin  and  Eosin  (H&E)  pairs  of  sections. 
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Abstract — There  are  many  problems  in  mammary  gland 
biology  where  the  spatial  distribution  of  the  cells  may  be 
playing  a  fundamental  role.  Thus,  a  tool  that  can  quantitatively 
measure  that  distribution  would  be  extremely  helpful  in  order 
to  solve  some  of  the  previously  mentioned  problems.  Here  we 
present  a  method  for  the  spatial  analysis  of  mammary 
epithelium  based  on  a  multiscale  study  of  neighborhood 
relationships.  A  function  to  measure  those  relationships,  A/,  is 
introduced.  The  refined  Relative  Neighborhood  Graph  is  then 
presented  as  a  method  to  establish  vicinity  relationships 
between  epithelial  cell  nuclei  in  the  mammary  gland.  Finally, 
the  method  is  illustrated  with  two  examples  that  show 
interactions  within  one  population  of  epithelial  cells  and 
between  two  different  populations. 

Keywords — Mammary  gland,  quantitative,  spatial 

distribution. 

I.  Introduction 

The  spatial  distribution  of  cells  within  epithelial  tissues 
plays  a  fundamental  role  in  development,  function  and 
regeneration  of  these  tissues  [1,  2].  For  example,  in 
mammary  gland  development,  cap  cells  with  invasive 
properties  line  the  surface  of  the  growing  ducts,  which 
extend  through  the  fat  pad  that  embeds  the  gland.  In  the 
alveolar  units  that  cap  the  ducts  of  a  mature  gland,  secretory 
epithelial  cells  line  the  lumen  of  the  ducts.  After  pregnancy, 
this  mature  luminal  epithelium  secretes  milk  into  the  ducts. 
Myoepithelial  cells  are  arranged  around  the  ductal  tree,  as 
well  as  around  the  terminal  alveolar  units.  Upon,  the 
appropriate  stimulus  they  contract,  thus  forcing  the  milk 
through  the  ducts  towards  the  nipple  [3]. 

Many  of  these  spatial  phenomena  have  been  previously 
described  in  a  qualitative  way  but,  due  to  the  lack  of 
appropriate  tools,  most  of  them  have  not  been  quantitatively 
studied.  For  example,  the  colocation  in  the  patterns  of 
expression  of  the  estrogen  (ER)  and  the  progesterone  (PR) 
receptors  in  luminal  epithelial  cells  [4];  the  fact  that 
proliferation  markers  are  not  expressed  by  ER+  cells  [5];  or 
the  possible  presence  of  a  niche  -a  highly  organized  pattern 
of  different  cell  types-  around  mammary  stem  cells  [6]  have 
never  been  assessed  from  a  quantitative,  spatial  point  of 
view. 

In  order  to  address  these  questions,  we  have  developed  a 
quantitative  spatial  analysis  tool  that  we  have  integrated  into 
our  3D  microscopy  system  [7].  In  this  paper  we  introduce 
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that  tool  and  show  two  examples  obtained  on  real  data,  thus 
demonstrating  how  we  will  use  it  to  address  some  of  the 
questions  mentioned  above. 

II.  Methodology 

A.  Tissue  processing 

Mammary  gland  tissue  blocks  are  sectioned  at  5  pm, 
and  the  sections  are  immunostained  for  the  appropriate 
antigens.  A  counterstain  is  used  that  allows  us  to  study  the 
morphology  of  the  tissue  (e.g.:  DAPI).  Low  magnification 
(2.5X)  images  of  the  counterstain ing  of  all  the  sections  are 
then  automatically  acquired  using  a  motorized  Zeiss 
Axioplan  I  microscope  coupled  with  a  monochrome  XilliX 
Microimager  CCD  camera.  This  is  done  automatically  by 
scanning  the  area  of  the  slide  occupied  by  the  tissue  and 
tiling  together  all  the  individual  snapshots  into  a  single 
image  of  the  entire  section. 

The  next  step  consists  of  automatically  annotating  the 
structures  of  interest  (ducts,  tumors,  ...)  in  these  low 
magnification  images  [8].  These  annotated  structures  are 
then  used  to  reconstruct  a  three-dimensional  model  of  the 
sample.  With  this  model  we  can  track  the  morphology  of  the 
tissue  to  determine  which  are  the  areas  where  a  spatial 
analysis  might  be  more  interesting.  After  selecting  these 
areas,  the  system  asks  the  user  to  place  the  right  fluorescent 
section(-s)  on  the  stage,  and  high  magnification  (40X) 
images  of  the  immunostaining  of  the  chosen  areas  are 
acquired.  In  these  images  nuclei  are  manually  annotated 
with  dots  and  visually  classified,  forming  a  point  pattern; 
they  can  also  be  automatically  segmented  and  quantified  [9]. 
In  the  later  case,  the  center  of  mass  of  each  nucleus  is 
computed  to  obtain  a  point  pattern  of  nuclei  markings. 

B.  M  function  analysis 
B.l.  Definitions 

Given  a  set  of  points  {nu  ...,  wNc}  representing  the  nuclei 
belonging  to  population  C  in  the  study  area,  we  define  niCr> 
the  number  of  neighbors  of  nucleus  n,  belonging  to 
population  C  within  distance  r;  nir>  the  total  number  of 
neighbors  of  n,  (belonging  to  any  population)  within  distance 
r;  Nc,  the  total  number  of  nuclei  belonging  to  population  C 
within  the  area  under  study;  and  N,  the  total  number  of 
nuclei  in  that  same  area  (Fig.  1). 

B.2.  Single-variable  analysis 

The  distribution  of  epithelial  cells  across  mammary 
tissue  is  not  homogeneous:  they  can  only  be  located  at 
ducts,  end  buds  and  alveolar  structures,  but  never  in  the  fat 
pad  surrounding  the  gland,  which  also  shows  up  in  the 
images.  For  that  reason,  we  cannot  do  our  spatial  analysis 
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with  Ripley's  K  function  [10],  which  has  traditionally  been 
used  in  other  fields  for  this  task.  We  now  assume  a  space 
with  heterogeneous  nuclei  density.  In  this  space,  the  total 
number  of  nuclei  on  an  area  measures  the  size  of  the  set  of 
possible  locations  for  an  epithelial  nucleus  in  that  area:  any 
of  those  points  could  be  occupied  by  a  nucleus.  Thus,  we 
can  measure  the  density  of  nuclei  belonging  to  population  C 
as  a  ratio  of  nuclei  numbers,  and  so,  we  define  [11]: 

N  c  n 
y  "iCr 

&  nir  N  c 

M  r,C  (1) 


Nr 


N 


The  numerator  of  (1)  computes  the  average  density  of 
neighbors  belonging  to  population  C  within  distance  r,  and 
then  compares  that  value  to  a  benchmark:  the  density  of 
nuclei  belonging  to  population  C  in  the  entire  study  area. 
Therefore,  clustered  patterns  of  nuclei  will  have  Af(r,  C)  >  1, 
with  a  peak  at  the  cluster  size.  On  the  other  hand,  regular 
patterns  will  have  A/(r,  C)  <  1.  Finally,  random  distributions 
will  have  Af(r,  C)  =  1 .  In  general,  we  can  say  that  A/(r,  C)  = 
k  implies  that  the  density  of  nuclei  belonging  to  population 
C  within  distance  r  is  k  times  that  of  the  entire  area  under 
study. 

To  complete  the  univariate  analysis  we  need  to  have  a 

vay  to  establish  the 
significance  of  our 
neasurements.  For  that 
eason,  we  run  m  Monte 
Jarlo  simulations  of  the 
jiiuclei  distribution  within  the 
area  of  interest.  The 

ion  examples. 


simulations  are  set  up  by  preserving  the  nuclei  locations  and 
randomly  assigning  the  population  where  each  nucleus 
belongs.  We  compute  the  M  function  for  each  one  of  these 
simulations  (A/,(r,  Q,  /  =  1,  ...,  m)  and  calculate  U(r ,  C)  and 
L(r ,  O: 

U(r,  C»=  max  (M,(r ,C»  (2) 

/=1, ..  .  m 

L ( r ,  C)=  min  <M;(r,C))  (3) 

/ =1 m 

Now  we  can  plot  A/(r,  C),  U(r,  C)  and  L(r ,  C)  in  the  same 
graph.  Peaks  of  A/(r,  C)  above  U(r ,  C)  are  evidence  of 
significant  clustering  (with  confidence  level  a  =  \/(m  +  1)). 
Similarly,  troughs  below  L(r ,  Q  represent  significant 


regularity  or  dispersion.  Any  nuclei  distribution  with  no 
significant  peaks  or  troughs  can  be  considered  to  be  random. 


B.3.  Multiple-variable  analysis 

The  M  function  analysis  described  in  the  previous 
section  can  be  used  to  study  the  distribution  of  cells  within  a 
single  population.  However,  most  of  the  problems 
introduced  in  section  1  involve  two  or  more  cell  populations. 
In  order  to  study  this  type  of  problems,  we  can  modify  (1)  to 
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where  C\  and  C2  are  the  two  populations  under  study,  Nn 
and  N&  are  the  number  of  nuclei  in  each  one  of  those 
populations  and  nK1r  is  the  number  of  nuclei  belonging  to 
population  C2  within  distance  r  of  nucleus  n,  (with  nt 
belonging  to  C|).  It  is  easy  to  see  how  these  equation  could 
be  extended  to  three  or  more  populations. 


Now  M  values  larger  than  1  are  indicative  of  attraction 
between  populations  C\  and  C2  (a  special  case  of  this  is 
colocation ,  i.e.,  attraction  at  distance  r  =  0);  values  smaller 
than  1  indicate  repulsion;  and  A/(r,  C i,  C2)  =  1  shows 
independence  of  the  spatial  distributions  of  both  populations 
at  distance  r.  However,  significant  values  of  A/(r,  C|,  C2) 
maybe  due  either  to  actual  interactions  between  both 
populations  or  to  the  patterns  of  each  one  of  them.  For  this 
reason,  we  set  up  our  Monte  Carlo  simulations  preserving 
the  locations  of  the  nuclei  belonging  to  population  C x  and 
redistributing  the  location  of  population  C2.  Thus,  we 
control  for  the  C\  pattern.  The  same  process  is  applied  to 
M(r ,  C2,  Ci).  Finally,  significant  interaction  at  distance  r  is 
only  accepted  if  both  A/(r,  C|,  C2)  and  M(r ,  C2,  Ci)  are 
significantly  different  from  randomness. 


C.  Refined  RNG 

In  the  previous  section  we  have  used  the  shortest 
Euclidean  distance  to  measure  how  far  apart  two  nuclei  are. 
However,  this  is  not  the  best  way  to  represent  vicinity.  In 
fact,  the  shortest  Euclidean  distance  is  often  obtained 
through  luminal  areas  where  nuclei  cannot  be  located.  This 
is  in  contrast  with  cell-to-cell  signaling  in  the  epithelium, 
which  normally  occurs  through  intermediate  cells  [12]. 
Therefore,  we  decided  to  model  our  tissue  using  a  graph 
where  the  nodes  are  the  different  nuclei,  edges  represent 
neighborhood  relationships  and  distances  can  be  measured 
as  the  number  of  edges  between  two  nuclei. 
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We  start  out  by  building  a  Delaunay  triangulation  using 
the  nuclei  markings  as  nodes.  This  provides  a  preliminary 
tessellation  where  we  can  already  measure  distances  as 
number  of  edges.  On  top  of  this  triangulation  we  can  now 
build  a  Relative  Neighborhood  Graph  (RNG).  Here,  we 
preserve  an  edge  if  an  only  if  the  two  nuclei  on  its  sides  (n, 
and  nj)  are  relatively  close  [13],  that  is: 

d(nitnj)<  max(d(ni,nl(),d(nj)nk))  (5) 

v  k  =  ,k*  i  ,j 


For  the  second  example  we  obtained  tissue  from  a  wild 
type  mouse  which  had  been  given  a  constant  dose  of 
BromodeoxyUridine  (BrdU)  for  two  weeks.  BrdU  is  a 
thymidine  analog  that  gets  incorporated  into  the  DNA  of  the 
cells  that  undergo  mitosis.  The  tissue  was  sectioned,  and 
double  immunofluorescence  staining  was  carried  out  on  the 
sections.  BrdU  was  detected  using  a  secondary  antibody 


where  d(n„  nj)  is  the  length  of  the  edge  between  n,  and  nr  In 
other  words,  what  this  definition  states  is  that  an  edge  in  the 
Delaunay  triangulation  is  preserved  if  the  nuclei  on  its  sides 
are  at  least  as  close  to  each  other  as  they  are  to  any  other 
nucleus  in  the  graph.  With  this  we  obtain  the  RNG.  Finally, 
we  do  a  refinement  step  where  we  get  rid  of  all  the  edges 
which  are  too  large  and  we  force  connections  between 
nuclei  which  are  too  close  (using  the  shortest  Euclidean 
distance)  to  not  to  be  neighbors.  Now,  using  Floyd's  or 
Dijkstra's  algorithms  [14],  we  can  easily  build  a  table  with 
the  shortest  distance  (measured  as  the  number  of  edges) 
between  each  pair  of  nuclei  in  the  graph. 

III.  Results 

We  run  our  spatial  analysis  tool  on  a  set  of  synthetic 
images  to  test  for  its  accuracy  at  detecting  interactions  both 
within  and  between  populations.  Then  we  went  on  to  the 
analysis  of  real  tissue  samples.  In  this  section  we  describe 
two  different  examples.  For  the  first  one  the  tissue  was 
obtained  from  a  transgenic  mouse  overexpressing  the  HER2 
gene,  a  growth  factor  receptor  whose  human  counterpart  is 
overexpressed  in  about  30%  of  breast  cancers.  Sections  were 
taken  from  this  sample  and  immunostained  for  HER2  using 
diaminobenzidine  (brown  precipitate).  The  nuclei  were 
counterstained  with  hematoxylin  (blue).  High  magnification 
images  of  certain  areas  in  these  sections  were  acquired,  and 
the  nuclei  in  those  areas  were  manually  annotated.  Fig.  2 
shows  the  spatial  analysis  of  the  HER24  population  in  one  of 
those  areas  (inset).  Positive  nuclei  are  marked  with  red  dots, 
negative  ones  are  green.  The  analysis  indicates  the  presence 
of  clustering  at  small  distances  around  the  nuclei  (at  2  to  8 
nuclei  of  distance  as  measured  by  edges  in  the  refined 
RNG),  with  a  peak  at  distance  r  =  2.  This  peak  reveals  the 
presence  of  clusters  of  HER24  cells  with  a  radius  of  2  nuclei 
and  a  density  of  M=  1.55. 
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labeled  with  Alexa  568,  a  red  fluorochrome,  while  Alexa 
488  (green)  was  used  to  detect  ER4  cells.  Nuclei  were 
counterstained  with  DAPI  (blue).  Once  again,  high 
magnification  images  of  some  areas  were  taken,  and  nuclei 
were  manually  annotated.  Then,  multiple-variable  analysis 
of  the  interaction  between  ER4  and  BrdU4  cells  was  carried 
out.  Fig.  3  shows  an  example  of  this  type  of  analysis  (A/(r, 
ER4,  BrdU4)).  Nuclei  have  been  removed  from  the  inset 
image  for  clarity.  The  graph  for  A/(r,  BrdU4,  ER^)  is  very 
similar  to  the  one  shown  here.  Thus,  there  seems  to  be 
repulsion  between  both  populations  at  very  small  scales. 
Actually,  the  peak  of  this  repulsive  interaction  is  at  distance 
r  =  0  (M—  0.15),  i.e.,  ER4  and  BrdU4  cells  do  not  colocalize 
most  of  the  times,  but  do  colocalize  occasionally.  This 
result,  whose  intensity  and  extent  we  can  now  quantify 
using  the  M  values,  has  previously  been  described  in  the 
literature. 
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ng  spatial  phenomena  is  of  great  importance  in  biology  in 
general,  and  particularly  in  mammary  gland  studies.  In  order 
to  do  this  in  a  way  that  provides  consistency  and  high 
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throughput,  quantitative  tools  for  the  spatial  analysis  of 
samples  are  required.  In  this  paper  we  have  presented  a 
method  that  automatically  provides  a  measurement  of  the 
way  cells  interact  within  one  population,  as  well  as  of  the 
different  types  of  interaction  that  might  occur  between  the 
different  cell  populations  present  in  a  tissue. 

Our  approach  is  based  on  a  multiscale  analysis  of  the 
number  of  neighbors  belonging  to  the  population  under 
study,  followed  by  comparison  to  a  benchmark,  the  total 
density  of  nuclei  within  that  population  in  the  entire  study 
area.  Thus,  we  define  the  M  function,  which  takes  into 
account  the  heterogeneous  distribution  of  the  epithelium 
within  mammary  tissue.  This  function,  -together  with  the 
analysis  scheme  where  it  is  embedded-,  allows  for 
unsupervised  analysis  of  large  data  sets  in  a  reasonable  time, 
since  it  does  not  include  any  complex  calculation.  The 
method  provides  comparability  of  concentration 
measurements  across  populations;  remains  unbiased 
concerning  different  scales;  and  can  be  modified  depending 
on  the  desired  significance  level. 

In  order  to  define  neighborhood  in  a  way  that  takes  into 
account  the  histology  of  the  tissue  as  well  as  differences  in 
cell  size/image  magnification,  we  create  a  refined  RNG  that 
has  the  nuclei  markings  as  its  nodes.  The  connections  in  this 
graph  represent  vicinity  in  a  way  that  faithfully  depicts  what 
nuclei  might  be  directly  interacting  with  each  other  in  the 
tissue. 

In  the  near  future  we  are  planning  on  using  this  tool  to 
address  several  problems,  including  colocation/interaction 
studies  of  ERf,  PR'  and  HER2+  populations  in  both  wild 
type  and  transgenic  mice,  or  characterization  of  the 
distribution  of  label-retaining  cells  (a  population  of  cells 
likely  to  be  enriched  for  mammary  stem  cells)  with  respect 
to  other  populations  present  in  the  mammary  epithelium, 
thus  trying  to  unveil  the  presence  of  a  niche  around  stem 
cells  similar  to  the  ones  observed  in  other  organs.  Since  both 
of  these  problems  are  inherently  three-dimensional,  we  are 
currently  working  on  adding  one  more  dimension  to  our 
analysis  scheme.  This  extended  functionality,  together  with 
the  implementation  of  methods  taken  from  the  fields  of 
pattern  recognition  and  mathematical  morphology  on 
graphs,  should  help  us  explore  in  further  detail  the  possible 
determinants  of  interaction  both  within  and  between  cell 
populations. 


are  expected  will  greatly  contribute  to  the  detailed 
description  of  these  phenomena. 
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V.  Conclusion 

In  this  paper  we  have  presented  a  method  for  the  spatial 
analysis  of  mammary  epithelium.  Thus,  we  have  created  a 
tool  to  quantitatively  measure  what  previously  could  only  be 
qualitatively  described.  Our  multiscale  method  is  consistent, 
comparable  across  populations  and  allows  for  automatic, 
high-throughput  analysis  of  large  data  sets.  The  use  of  this 
approach  to  study  problems  where  interactions  between  cells 
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APPENDIX  5 


In  situ  analyses  of  genome  instability  in  breast  cancer 

Koei  Chin1,6,  Carlos  Ortiz  de  Solorzano2,6,  David  Knowles2,  Arthur  Jones3,  William  Chou2, 

Enrique  Garcia  Rodriguez2,  Wen-Lin  Kuo1,  Britt-Marie  Ljung4,  Karen  Chew1,  Kenneth  Myambo1, 

Monica  Miranda1,  Sheryl  Krig2,  James  Garbe2,  Martha  Stampfer2,  Paul  Yaswen2,  Joe  W  Gray1,2,6  & 
Stephen  J  Lockett2,5,6 


Transition  through  telomere  crisis  is  thought  to  be  a  crucial 
event  in  the  development  of  most  breast  carcinomas.  Our  goal 
in  this  study  was  to  determine  where  this  occurs  in  the  context 
of  histologically  defined  breast  cancer  progression.  To  this  end, 
we  assessed  genome  instability  (using  fluorescence  in  situ 
hybridization)  and  other  features  associated  with  telomere 
crisis  in  normal  ductal  epithelium,  usual  ductal  hyperplasia, 
ductal  carcinoma  in  situ  and  invasive  cancer.  We  modeled  this 
process  in  vitro  by  measuring  these  same  features  in  human 
mammary  epithelial  cell  cultures  during  ZNF21 7-media  ted 
transition  through  telomere  crisis  and  immortalization.  Taken 


together,  the  data  suggest  that  transition  through  telomere 
crisis  and  immortalization  in  breast  cancer  occurs  during 
progression  from  usual  ductal  hyperplasia  to  ductal 
carcinoma  in  situ. 

The  molecular  events  that  enable  normal  epithelial  cells  to  progress  to 
invasive,  metastatic  disease  are  increasingly  well  understood1,2.  Dereg¬ 
ulation  of  the  TP53  and  RBI  pathways  in  most  cancers  enables 
extended  proliferation.  In  breast  cancer,  deregulation  of  RBI  through 
inactivation  of  cyclin-dependent  kinase  inhibitor  2 A  ( CDKN2Ay  also 
called  p!6  and  INK4a)  seems  to  be  an  early  event3.  Most  epithelial  cells 


# 


Figure  1  2D  confocal  Y0-PR0-1  images  and 
bivariate  copy-number  histograms  of  chromosome 
lcen  and  20ql3.2  signals  in  3D  images.  Open 
bars  indicate  the  numbers  of  cells  with  copy 
numbers  specified  on  the  x  and  y  axes.  Filled 
bars  indicate  the  number  of  cells  with  two  copies 
each  of  lcen  and  20ql3.2.  (a)  A  2D  confocal 
Y0-PR0-1  image  of  normal  ductal  epithelium 
taken  midway  through  a  30-pm-thick  tissue 
section.  The  white  squares  (100  pm  x  100  pm) 
indicate  regions  for  which  3D  confocal  images 
were  acquired  for  copy-number  analysis,  (b)  A 
bivariate  copy-number  frequency  histogram  of  the 
number  of  copies  of  lcen  and  20ql3.2  in  the 
regions  indicated  in  a.  More  than  90%  of  nuclei 
had  two  copies  of  lcen  and  20ql3.2.  Statistical 
analysis  of  measurements  on  537  nuclei  from 
several  normal  specimens  indicated  that  93  ± 
4%  of  genuine  FISH  signals  were  detected  and 
that  there  was  a  4  ±  4%  chance  that  a  detected 
signal  was  spurious  (Supplementary  Table  1 
online),  (c)  A  2D  confocal  Y0-PR0-1  image  of  a 


UDH.  (d)  A  bivariate  frequency  histogram  of  lcen  and  20ql3.2  copy  numbers  measured  for  the  regions  indicated  in  c.  Most  cells  had  two  copies  of 
20ql3.2,  but  22%  of  cells  had  only  one  copy  of  lcen.  (e)  A  2D  confocal  Y0-PR0-1  image  of  a  DCIS  showing  an  expanded  duct  filled  with  heterogeneous 
tumor  cells,  (f)  A  bivariate  frequency  histogram  of  lcen  and  20ql3.2  copy  numbers  measured  for  the  regions  indicated  in  e  showing  substantial  genomic 
instability.  DNA  content  was  associated  with  genome  copy  number  measured  using  FISH  in  this  sample  (correlation  coefficient  =  0.83;  89  data  pairs). 
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lack  active  telomerase,  and  so  extended  proliferation  leads  to  telomere 
erosion  and,  eventually,  loss  of  telomere  function4.  These  cells  become 
genomically  unstable5  and  are  almost  always  eliminated  by  damage- 
surveillance  mechanisms5-7.  Rarely,  however,  one  or  a  few  cells  escape 
this  protective  mechanism  by  reactivating  telomerase7  and  develop 
additional  cancer  hallmarks2  through  accumulation  of  multiple  geno¬ 
mic  and  epigenomic  aberrations8.  Thus,  telomere  crisis,  defined  in  this 
paper  as  the  events  that  occur  when  cells  lose  telomere  function  as  a 
result  of  extended  proliferation  in  the  absence  of  telomerase,  is  a 
critical  rate- limiting  and  promoting  event1. 

Several  features  of  breast  cancer  suggest  that  telomere  crisis  occurs 
in  breast  cancer  during  transition  from  usual  ductal  hyperplasia 
(UDH)  to  ductal  carcinoma  in  situ  (DCIS).  First,  the  total  number 
of  aberrations  typically  is  much  higher  in  DOS  than  in  UDH9. 
This  increase  is  reminiscent  of  the  increase  in  genome  aberrations  in 
tumors  that  arise  in  late-generation  telomerase-knockout  mice1. 
Second,  progressive  shortening  of  telomeres10  and  reactivation 
of  telomerase  occurs  during  transition  from  UDH  to  DCIS11.  If 
transition  through  telomere  crisis  does  occur  during  transition  from 


Figure  2  A  2D  Y0-PR0-1  image  and  bivariate  copy  number  histograms  of 
chromosome  lcen  and  20ql3.2  signals  from  an  invasive  cancer.  Open  and 
filled  bars  are  as  defined  in  Figure  1.  (a)  A  2D  Y0-PR0-1  image  showing 
tumor  cells  invading  the  stroma  beyond  the  basement  membrane.  The  white 
squares  indicate  regions  for  which  3D  images  were  acquired  and  analyzed 
for  copy  number,  (b-d)  Bivariate  frequency  histograms  of  lcen  and  20ql3.2 
copy  numbers  measured  for  the  regions  indicated  in  a.  Some  regions  had 
near-normal  copy  numbers  (b),  but  others  had  high  copy-number  variability 
(c,d).  DNA  content  and  nuclear  volume  were  nearly  normal  for  the  cells  in  b 
and  increased  for  the  cells  in  c  and  d. 


UDH  to  DCIS,  genome  instability  should  increase  substantially  during 
this  transition.  We  assessed  this  possibility  by  measuring  genome 
instability  (using  fluorescence  in  situ  hybridization,  FISH),  DNA 
content  and  telomere  length  at  several  histologically  distinct  stages 
of  breast  cancer  progression.  We  compared  these  features  with  those 
observed  in  cultured  human  mammary  epithelial  cells  (HMECs) 
during  transition  through  ZNF21 7- mediated  telomere  crisis  and 
immortalization12. 

We  assessed  instability  by  three-dimensional  (3D)  confocal  micro¬ 
scopic  analysis  of  ~  30-|im-thick  tissue  sections  stained  for  copy 
number  using  dual-color  FISH13  with  probes  for  the  centromere  of 
chromosome  1  (lcen)  and  chromosome  20ql3.2  and  with  the  DNA- 
specific  dye  YO-PRO-1.  We  analyzed  three-color  (DNA  stain  and  two 
FISH  probe  labels),  3D  images  generated  using  confocal  microscopy  in 
regions  of  histological  interest  to  determine  the  boundaries  and 
volumes  of  intact  nuclei  and  to  enumerate  dual-color  FISH  signals 
therein  (Supplementary  Fig.  1  online).  We  used  the  standard  devia¬ 
tion  of  the  number  of  the  FISH  signals  (ctcn)  as  a  quantitative 
measure  of  genome  instability.  We  obtained  histological  information 
from  YO-PRO-1  images  and  confirmed  it  in  some  cases  by  micro¬ 
scopic  analysis  of  adjacent  thin  sections  stained  with  hematoxylin  and 
eosin  (Supplementary  Fig.  2  online). 

We  measured  genome  instability  and  DNA  content  in  normal 
human  skin,  normal  breast  epithelium  and  stroma,  three  UDH 
specimens,  three  DCIS  specimens  and  four  invasive  cancer  specimens 
(Figs.  1-3).  One  sample  of  normal  epithelium  had  low  instability 
(ctcn  =  ~°-5>  Fi8*  la>b)>  as  did  one  UDH  specimen  (gcn  =  ~0.6- 
0.7;  Fig.  lc,d).  Most  cells  had  two  copies  of  20ql3.2,  but  a  significant 
fraction  (22%,  P  <  0.001)  had  only  one  copy  of  lcen.  Spatial 


Figure  3  Genomic  events  measured  in  vivo  and 
in  immortalized  HMECs  using  dual-color  FISH 
and  array  CGH.  (a)  Standard  deviations  of  lcen 
and  20ql3.2  copy  numbers  measured  in  normal 
epithelium,  UDH,  DCIS  and  invasive  cancer  (1C). 
Instability  was  highest  in  DCIS  and  high  in 
invasive  cancer,  but  somewhat  lower  than  in 
DCIS.  (b)  Genome  copy-number  distributions 
measured  using  array  CGH  in  ZNF217- 
immortalized  HMECs12.  log2  relative  copy 
number  (y  axis)  is  plotted  as  a  function  of 
distance  along  the  genome  (xaxis;  lpter  on  the 
left  and  chromosomes  22qter  and  X  to  the  right). 
Vertical  lines  indicate  chromosome  boundaries. 
Telomerase  activity  and  telomere  lengths 
measured  in  earlier  studies12  are  indicated  to  the 
right  of  the  array  CGH  profiles,  (c)  Standard 
deviations  of  lcen  and  20ql3.2  copy  numbers 
measured  using  FISH  for  several  passages  of 
ZNF21 7-immortal ized  HMECs.  Similar  data  was 
obtained  for  chromosomes  2p24  and  2q36 
(data  not  shown). 
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Figure  4  Anaphase  bridges  in  human  breast  tumor  specimens  and  nuclear  bridges  in  ZNF21 7- immortalized  HMECs.  (a)  Anaphase  bridges  in  an  atypical 
ductal  hyperplasia  specimen.  Scale  bar,  10  urn.  (b)  Left,  nuclear  bridges  in  DAPI-stained  preparations  of  ZNF2 17- immortalized  HMECs.  Right,  uneven 
partitioning  at  p22  of  loci  on  chromosome  2  as  indicated  by  FISH  with  probes  to  MYCN  (2p24;  green)  and  INDPP5D  (2q36;  red).  Scale  bars,  2.5  pm. 
(c)  Nuclear  bridge  frequencies  scored  in  300  nuclei  at  several  passages  of  ZNF21 7- immortalized  HMECs. 


statistical  analyses  showed  that  the  nuclei  with  one  and  two  copies  of 
lcen  were  randomly  mixed.  The  nuclear  volumes  in  all  regions  of 
UDH  were  similar  to  those  measured  for  normal  epithelial  cells.  One 
DCIS  specimen  had  significantly  higher  instability  (aCN  =  ~  1-1.5; 
P  <  0.01,  Wilcoxon  rank  test;  Fig.  le,f).  Cells  in  this  sample  varied 
substantially  in  genome  composition  and  DNA  content,  and  the 
number  of  copies  of  lcen  plus  20ql3.2  was  strongly  correlated  with 
DNA  content  (correlation  coefficient  =  0.83),  indicating  that  the 
increase  in  DNA  content  was  contemporaneous  with  the  increase  in 
genome  instability.  Spatial  analyses  of  this  and  another  specimen 
showed  that  cells  with  widely  different  chromosomal  compositions 
were  randomly  mixed,  indicating  that  most  copy-number  variation 
was  due  to  high  instability  rather  than  clonal  evolution.  A  third  DCIS 
specimen  had  a  near- normal  level  of  instability,  suggesting  that  it 
might  be  less  for  along  in  transition  through  telomere  crisis.  Increased 
20ql3.2  copy  number  was  the  dominant  feature;  instability  was 
low  and  the  nuclear  volume  was  near  normal.  One  invasive  cancer 
sample  had  high  instability,  but  lower  than  that  of  the  DCIS  specimens 
(aCN  =  —  0.67—1.28;  Fig.  2),  although  the  difference  was  not  sig¬ 
nificant  (P  <  0.38,  Wilcoxon  rank  test).  Some  areas  (Fig.  2b)  showed 
near- normal  copy  number  and  volume,  whereas  others  (Fig.  2c,d) 
were  highly  variable  in  copy  number  and  DNA  content.  Cells  with 
widely  different  chromosomal  composition  were  randomly  mixed  in 
most  invasive  cancers,  although  some  diploid  cells  were  spatially 
segregated  from  cells  with  genomic  abnormalities. 


We  scored  mitoses  and  anaphase  bridges  in  thin  sections  stained 
with  hematoxylin  and  eosin  in  UDH  and  DCIS  cases  adjacent  to 
sections  that  were  analyzed  using  thick-section  FISH  and  in  an 
additional  8  UDH  cases,  12  atypical  ductal  hyperplasias  and  11 
DCIS  specimens.  Anaphase  bridges,  considered  a  hallmark  of  telomere 
crisis7,  were  present  in  two  atypical  ductal  hyperplasias  and  two  DCIS 
specimens  (Fig.  4a). 

We  assessed  telomere  length  using  FISH  as  described10,  except  that 
we  used  dual-color  FISH  with  probes  to  the  centromeres  and 
telomeres  so  that  telomere  length  could  be  normalized  to  the 
centromeric  hybridization  intensity.  The  average  telomere  length  was 
significantly  shorter  in  the  DCIS  lesions  than  in  normal  tissue  (P  < 
0.03;  Fig.  5a),  as  expected,  and  was  significantly  shorter  in  invasive 
cancers  than  in  DCIS  (P  <  0.01),  even  though  telomerase  had  been 
reactivated.  This  is  probably  because  telomere-length  control  acts  in 
ris  at  each  individual  chromosome  end,  so  that  the  critically  short 
telomeres  are  stabilized  but  the  others  continue  to  erode4. 

We  compared  the  genomic  events  that  occurred  during  progression 
from  UDH  to  invasive  cancer  with  those  that  occurred  in  cultures  of 
HMECs  before,  during  and  after  ZNF217- mediated  immortaliza¬ 
tion12.  In  this  model,  ZNF217  transduction  of  HMECs  with  a  finite 
lifespan  allowed  one  or  a  few  cells  in  each  culture  to  activate 
telomerase,  stabilize  telomere  length  and  develop  resistance  to  trans¬ 
forming  growth  factor- p.  These  cells  were  effectively  immortal  and 
could  be  propagated  indefinitely.  In  one  culture,  passage  through 


Figure  5  Relative  lengths  of  telomeres  in  vivo  a nd  in  vitro  were  assessed  using  dual-color  FISH,  (a)  Relative  lengths  of  telomeres  in  histologically  defined 
stages  of  breast  cancer  progression:  three  samples  of  normal  epithelium  (NE),  three  samples  of  UDH,  one  sample  of  atypical  ductal  hyperplasia  (ADH),  five 
samples  of  DCIS  and  three  samples  of  invasive  cancer  (1C).  Error  bars  indicate  the  standard  deviation  of  all  images  in  each  histological  type,  (b)  Left, 
telomere  length  comparisons  in  three  different  passages  in  HMECs  (prestasis  finite  lifespan  184,  p5;  post-selection  finite  lifespan  184,  pi 2;  and  the  p53- 
immortal  line  184AA2,  p24)  determined  using  FISH  (yaxis)  and  the  telomere  restriction  fragment  (TRF)  Southern-blot  analysis  Uaxis)  as  described 
previously26.  Right,  telomere  restriction  fragment  assays  for  telomere  length  in  these  cultures. 
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Figure  6  A  schematic  representation  of  genomic  events  associated 
with  breast  cancer  progression.  These  events  are  generalizations  from 
published  studies  and  the  present  work  describing  total  genomic 
aberrations9-27-29,  telomerase  activity1 1,30f  telomere  length10  and  genome 
instability.  Individual  tumors  and  lesions  may  vary  substantially  from  this 
evolutionary  pattern. 


telomere  crisis  occurred  at  passage  22  (p22;  Fig.  3b,c),  as  documented 
by  maximal  nuclear  bridge  frequency  and  reactivation  of  telomerase 
(Fig.  4).  Instability  was  low  before  p22  (cjcn  =  ~0.6),  maximal  at 
p22  (ctcn  =  ~  1.5)  and  significantly  lower  thereafter  (ctcn  =  ~0.75; 
P  <  0.04,  Wilcoxon  rank  test;  Fig.  3c).  Genomic  aberrations  detected 
by  array  CGH  were  not  apparent  before  p22,  increased  substantially  by 
p28  and  changed  slowly  thereafter  (Fig.  3b).  Telomere  length 
decreased  during  and  after  transition  through  telomere  crisis,  as 
observed  in  vivo  (Fig.  5b).  We  obtained  similar  results  for  other 
cultures,  but  the  time  of  passage  through  telomere  crisis  and  the 
spectrum  of  genomic  aberrations  detected  using  comparative  genomic 
hybridization  (CGH)  differed  (data  not  shown). 

The  genomic  events  associated  with  transition  through  telomere 
crisis  in  ZNF21 7-transfected  HMECs  were  remarkably  similar  to  those 
observed  in  breast  cancer  during  transition  from  UDH  to  DCIS 
(Fig.  6).  Telomere  length  decreased  steadily  during  and  after  transition 
through  telomere  crisis,  as  observed  during  progression  from  UDH  to 
invasive  cancer.  Genome  instability  and  the  frequency  of  anaphase 
bridges  were  low  before  telomere  crisis,  as  found  for  UDH;  highest 
during  crisis,  as  found  for  DCIS;  and  somewhat  reduced  thereafter,  as 
found  for  invasive  cancer.  Few  genome  aberrations  were  apparent 
before  transition,  as  observed  in  UDH;  the  number  increased  sharply 
shortly  after  telomere  crisis,  as  found  in  DCIS;  and  the  number 
changed  slowly  thereafter,  as  found  for  invasive  cancer9,14,15.  Many 
of  the  aberrations  detected  in  post-crisis  HMECs  involved  regions  that 
are  recurrently  aberrant  in  primary  tumors16,  suggesting  that  these 
result  from  active  selection.  Analyses  of  HMEC  transitions  though 
telomere  crisis  show  that  the  probability  of  successful  transition  is 
extremely  low12.  This  may  explain  why  the  risk  of  developing  invasive 
cancer  in  individuals  with  UDH  is  only  modestly  increased17. 

The  low  probability  of  transition  through  telomere  crisis  also 
suggests  that  the  DCIS  lesions  that  do  form  arise  from  single  cells 
in  which  telomerase  is  reactivated  and  that  carry  aberrations  that 
enable  further  progression  toward  malignancy1.  This  post-transition 
immortal  cell  and  its  progeny  would  have  the  features  ascribed  to 
tumor  stem  cells18:  active  telomerase  and  the  ability  to  propagate 
indefinitely.  In  vitro  and  xenograft  experiments  (data  not  shown), 
however,  suggest  that  the  post-transition  immortal  cells  remain  some¬ 
what  unstable  and  so  produce  the  aberrant  progeny  detected  using 
FISH  in  invasive  cancers.  The  fact  that  genomically  unstable  cell 


cultures,  xenografts  and  invasive  cancers  evolve  slowly,  as  measured 
using  CGH,  suggests  that  the  aberrant  progeny  are  at  a  proliferative 
disadvantage  relative  to  cells  that  divide  normally  (Supplementary 
Fig.  3  online).  This  may  explain  why  most  apparently  unstable 
primary  and  metastatic  tumors  evolve  relatively  few  new  genomic 
aberrations  that  can  be  detected  using  CGH. 

Positioning  of  immortalization  and  telomere  crisis  between  UDH 
and  DCIS  in  breast  cancer  has  important  clinical  implications.  For 
example,  assays  that  measure  telomerase  activity,  genome  instability 
and  the  presence  of  recurrent  aberrations  will  identify  lesions  that  have 
transitioned  through  this  checkpoint  and  are  at  increased  risk  of 
further  progression.  In  addition,  agents  that  enhance  damage  surveil¬ 
lance19,  inhibit  reactivation  of  telomerase20  or  poison  cells  with  active 
telomerase21  may  be  effective  in  preventing  the  rare  UDH-to-DCIS 
transition  and  hence  may  be  useful  preventive  agents. 

METHODS 

Tissues  and  sections.  We  cut  frozen  sections  from  three  UDH  cases,  three 
DCIS  specimens,  four  invasive  cancers  and  two  normal  breast  tissues,  stained 
them  with  hematoxylin  and  eosin  and  examined  them  to  identify  regions  of 
histological  interest.  We  cut  adjacent  ~  30- pnv  thick  sections  attached  to  tape 
(Instrumedics),  placed  them  on  coated  glass  slides  and  attached  them  by 
polymerization  through  exposure  to  380  nm  ultraviolet  irradiation  for  60  s.  We 
stored  slides  at  -80  °C  until  needed. 

FISH  to  thick  tissue  sections.  We  carried  out  FISH  to  thick  sections  as  des¬ 
cribed  previously13  and  in  Supplementary  Methods  online.  We  mounted 
sections  on  slides,  fixed  them  in  acetone,  digested  them  with  pepsin,  fixed  them 
again  in  a  methanol-acetone  solution,  air-dried  them,  denatured  them  and 
hybridized  them  for  2-3  d  to  denatured  probes  (a  Cy3-labeled  alpha -satellite 
probe  (pUC1.77)  for  lcen  and  an  Alexa  568-labeled  ~250-kb  BAC/Pl  contig 
containing  ZNF217  at  20ql3.2)  plus  excess  salmon  sperm  DNA  and  Cot-1  DNA. 
We  counterstained  hybridized  slides  with  YO-PRO-1  and  DAPI  in  antifade. 

3D  image  acquisition.  We  acquired  images  of  thick  tissue  sections  using  a  laser 
scanning  confocal  microscope  (Model  LSM410,  Carl  Zeiss)  equipped  with  a 
63x  ,  1.3  NA  plan-apo  objective  lens  as  described22.  YO-PRO-1,  Alexa568  and 
CY5  were  excited  at  488  nm,  568  nm  and  633  nm,  respectively,  and  emitted 
light  was  collected  at  515-565  nm,  575-640  nm  and  >665  nm,  respectively.  We 
selected  regions  for  3D  image  acquisition  based  on  histological  assessment  of 
two-dimensional  (2D)  YO-PRO-1  images.  3D  images  were  typically  512  x 
512  x  100  voxels  and  had  x,  y,  z  voxel  dimensions  of  0.2  pm,  0.2  pm  and 
0.3  pm,  respectively. 

Nuclear  segmentation.  We  segmented  the  3D  images  of  YO-PRO-1 -stained 
nuclei  as  described22.  The  3D  images  were  automatically  divided  into  high 
(nuclear)  and  low  (background)  intensity  regions  and  visually  assessed  to 
detect  clusters  of  nuclei  and  to  discard  incomplete  nuclei  and  debris.  Ousters 
were  automatically  subdivided  and  reassessed  visually.  We  summed  the  voxels 
comprising  each  segmented  nucleus  as  an  estimate  of  nuclear  volume,  which 
we  used  as  an  estimate  of  total  DNA  content  in  subsequent  analyses. 

Detection  of  FISH  signals.  We  detected  FISH  signals  in  each  segmented 
nucleus  by  applying  a  modified  morphological  top-hat  method  to  detect  high 
intensity,  punctate  signals  smaller  than  a  predefined  size.  We  added  gray-scale 
reconstruction  to  exclude  signals  smaller  than  expected  for  true  hybridization 
signals  and  used  an  automatic  gray-weighted  threshold  to  select  only  high- 
intensity  signals.  We  used  Gaussian  filtering  before  thresholding  to  merge 
closely  spaced  doublet  signals.  We  used  the  numbers  of  Alexa  568  and  Cy5 
signals  in  each  3D  nuclear  volume  as  measures  of  the  numbers  of  copies  of  1  cen 
and  20ql3.2  in  that  nucleus. 

Telomere  length  measurements.  We  used  FISH  with  Cy3-labeled  pan-telo- 
meric  and  fluorescein  isothiocyanate-labeled  pan-centromeric  peptide  nuclei 
acid  probes  (Applied  Biosystems)  to  label  centromeres  and  telomeres  in  5-pm- 
thick  sections  of  formalin-fixed  paraffin-embedded  tissues.  We  treated  deparaffi- 
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nized  tissues  in  1M  sodium  thiocyanate  at  80  °C  for  8  min  and  denatured  them 
in  70%  formamide  at  73  °C  for  5  min.  We  then  washed  sections,  stained  them 
with  DAPI  and  mounted  them  using  anti-fade  buffer.  We  acquired  Cy3,  fluore¬ 
scein  isothiocyanate  and  DAPI  images  using  a  scanning  confocal  microscope 
(Zeiss  LSM510).  See  Supplementary  Methods  online  for  additional  details. 

We  used  custom  software  to  segment  nuclei  in  regions  of  histological  interest 
and  to  measure  average  centromere  and  telomere  FISH  intensities  per  nucleus. 
We  estimated  relative  telomere  length  as  the  ratio  of  the  total  telomere  signal 
intensity  to  the  total  centromere  intensity.  We  analyzed  at  least  500  nuclei 
per  section. 

Error  assessment.  We  assessed  the  accuracy  of  the  copy  number  enumeration 
by  fitting  univariate  copy  number  frequency  histograms  with  the  function 


Prob(x\m,p,n)  =  £  (")  •  (1  -p)‘  V""0 

r=max(n— x.0)  '  ' 


„(*  +  »-«) 


(x  +  i  —  fi)l 


where  Prob  (x  |  m,p,n)  is  the  probability  of  detecting  x  signals  in  a  nucleus, 
assuming  that  spurious  spots  were  Poisson-distributed  with  mean  m,  that  the 
probability  of  a  true  genomic  target  being  detected  was  p,  and  that  there  were  rt 
genuine  genomic  targets  per  nucleus. 


Correlation  of  copy  number  with  nuclear  volume.  We  used  linear  regression 
analysis  to  assess  correlations  between  the  number  of  copies  of  lcen  and  the 
number  of  copies  of  20ql3.2,  and  between  the  sum  of  the  number  of  copies  of 
lcen  and  20ql3.2  and  nuclear  volume. 


Anaphase  bridge  analysis.  We  scored  anaphase  bridges  and  mitoses  in  ten 
high-power  fields  (40x  NeoFluar,  Model  Axioplan2,  Carl  Zeiss).  We  scored 
nuclear  bridges  in  300  nuclei  of  ZNF2I7-immortalized  HMECs  stained  with 
DAPI  as  surrogates  for  anaphase  bridges. 


CGH  and  FISH  analyses  of  HMECs.  We  grew  ZNF2 /7-transduced  HMECs  as 
previously  described12  and  collected  them  at  several  passages  for  genomic 
analyses.  We  analyzed  genome  copy-number  changes  using  array  CGH  as 
described  previously23.  We  labeled  DNAs  from  HMECs  and  normal  cells  with 
Cy3  and  Cy5,  respectively,  and  hybridized  them  for  2  d  along  with  excess  Cot-1 
DNA  to  microarrays  carrying  BAC  DNA  targets  distributed  at  ~  1-Mb  intervals 
along  the  genome.  We  stained  arrays  with  DAPI  after  hybridization.  We 
acquired  DAPI,  Cy3  and  Cy5  images  using  a  CCD  camera  system  and  analyzed 
them  as  described24  to  obtain  relative  copy  number  for  each  target  We 
plotted  log2  Cy3:Cy5  mean  intensity  ratios  according  to  location  along  the 
genome  (University  of  California  Santa  Cruz  human  genome  assembly,  June 
2002  freeze). 

We  carried  out  dual-color  FISH  analyses  of  HMECs  using  probe  combina¬ 
tions  of  lcen/20ql3.2  or  2p24/2q36  as  described25.  Probes  in  each  pair  were 
labeled  with  Alexa  488  and  Cy3,  respectively.  We  scored  200  nuclei  for 
each  analysis. 


Note  Supplementary  information  is  available  on  the  Nature  Genetics  website. 
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